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The purpose of this research is to extend these techniques to 
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Volterra series models for nonlinear systems . A nonlinear exten- 
sion of the ARMA model is also considered and is shown in some 
cases to remedy problems encountered in Volterra modeling of non- 
linear systems. Lattice methods are also developed for the non- 
linear ARMA model and it is shown that the methods obtained for 
linear ARMA modeling follow as a special case of the nonlinear 
results . 

Experimental verification of the methods developed for single 
channel linear ARMA modeling is presented and used to explore the 
characteristics of the lattice solution techniques. The results 
clearly indicate that the lattice methods are extremely powerful, 
capable of producing highly accurate system models using short 
runs of data. 
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The problem of obtaining parametric models for linear 
and nonlinear systems based on observations of the input 
and output of the system is one of wide ranging interest. 

For linear systems, moving average (HA) and autoregressive 
(AR) models have received considerable attention and, based 
on the Levinson algorithm, a number of very powerful methods 
involving lattice filter structures have been developed to 
obtain the model solutions. For nonlinear systems the 
Volterra series model which is a nonlinear extension of the 
moving average model is frequently used. 

The purpose of this research is to extend these tech- 
niques to more general linear and nonlinear models. Using 
the equation error formulation, lattice solution methods 
in batch processing and adaptive form are developed for both 
single and multichannel autoregressive moving average (ARMA) 
models for linear systems and Volterra series models for 
nonlinear systems. A nonlinear extension of the ARMA model 
is also considered and is shown in some cases to remedy 
problems encountered in Volterra modeling of nonlinear sys- 
tems. Lattice methods are also developed for the nonlinear 
ARMA model and it is shown that the methods obtained for 
linear ARMA modeling follow as a special case of the non- 
linear results , 

Experimental verification of the methods developed for 
single channel linear ARMA modeling is presented and used 
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to explore the characteristics of the lattice solution 
techniques. The results clearly indicate that the lattice 
methods are extremely powerful, capable of producing highly 
accurate system models using short runs of data. 
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I. INTRODUCTION 



Traditionally, man has attempted to create models of 
portions of his environment for two principal reasons: 

1. To gain insight and understanding as to their 
functioning ; 

2, As a prelude to taking some action such as attempting 
to exercise control over them. 

The field of physics for instance, is replete with examples 
where men have created models to study and explain phenomenon 
from planetary motion to the motion and even origin of sub- 
atomic particles. In designing machines, engineers routinely 
rely on models of the components they use to describe how they 
will function, and to obtain the desired results in the final 
product. Economics is another field where the use of models 
abounds for such purposes as identifying, forecasting or 
trying to direct trends. 

The scope of the modeling problem is quite broad be- 
gining with a decision on the type of model to be used, what 
physical quantities to measure, how to estimate the para- 
meters of the model from the measurement, and finally a veri- 
fication of the model. In the chapters that follow, one 
facet of this problem, that of estimating model parameters, 
will be explored in detail for a number of linear and non- 
linear models. 
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A. THE PROBLEM STATEMENT 



The primary concern of this work is the determination of 
discrete time models for both linear and nonlinear, time 
invariant systems from sampled observations of the system 
inputs and outputs. The general approach underlying all of 
the models examined here assumes that the system to be 
modeled is described by the equation 

y (k) = F iQ di(k)] + F 20 [y(k-l)] + F 30 [u(k) ,y(k-l)] (1.1) 

where F^q, F^q and F^ are functions of past and present 
values of their arguments, and u(k) and y(k) are the system 
input and output respectively. This is depicted in Figure 
(1.1). A possible method for modeling this type of system 
is to create a model of exactaly the same configuration with 
functions F^q , F 2 q and F^qj^ assume a form for these functions, 
operate the system and model in parallel with the same input 
and adjust the parameters of the model functions to minimize 

/s 

the mean square error (MSE) between the model output y(k) 
and the system output. The symbol " A " is used here to indi- 
cate that the signal is an estimate of y(k). This is depicted 
in Figure 1.2 and is often referred to as direct form modeling 
since the assumed topology of the system is directly copied 



Script notation will be used to refer to quantities 
associated with the system while nonscript notation will be 
used for their corresponding approximants in the model. 
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Figure 1.1. The assumed form for systems to be 

modeled. T represents a unit delay. 



u(k) 




e(k) 



Figure 1.2. A direct approach to system modeling. 



by the model. Here the model output is given by 



(k)=F 1 Q[u(k)]+F on [y(k-l)]+F, n [uCk),y(k-l)] (1.2) 



20 
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and the error signal is referred to as the "output error". 

As an example, if the system is linear an appropriate model 
is 

N 

F, n [u(k)] = ^ ^ a(i)u(k-i) (1.3a) 

1 i=0 

N 

F 2C) [y(k-l)] = / 4 Mi) y(k-i) (1.3b) 

i=l 

1 

F 3 g[u(k) ,y (k-1) ] = 0 (1.3c) 

(for linear models, F 3 q[*] will be zero). In general 
however, the direct form approach will have serious diffi- 
culties if either F 2 q[*] or Fgg[*] are nonzero since the 

/\ 

past values of y(k) used in these functions are themselves 
dependent on the model parameters, A mimimum mean square 
output error approach results in a system of highly non- 
linear simultaneous equations which must be solved to 
obtain the model parameters. 

To avoid these difficulties, the equation error approach 
[Refs. 34 and 23] to system modeling (which uses different 
model forms in the analysis and synthesis phases) will be 
applied to the problem. The analysis model is depicted in 
Figure 1.3 and differs from the direct form model in that 
F 2 g and Fgg are functions of past and present values of the 
delayed system output rather than the analysis model output. 
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Figure 1.3. The equation error approach for 
system modeling. 

For each of the models studied, a general form for the three 
functions is assumed and the parameters of the model (coef- 
ficients of the functions) are set to obtain a MMSE solution. 
In each case, the MSE cost function is a quadratic function 
of the model parameters (due to both the equation error 
formulation and to the form chosen for the functions) with 
a unique minimum and therefore the solution is given by a 
system of linear equations. The synthesis model is of the 
same form assumed for the system in Figure 1.1 and uses the 
functions F-^q , F£ 0 and Egg determined during the analysis 
phase. , 
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As an alternative to the topology shown in Figure 1.3 
it will occasionally be convenient to consider the error 
signal e(k) as the output of the analysis model rather than 

/■s 

the prediction y(k) , This can be accomplished in one of two 
ways by defining 

F 2 Cy (k) ] = y ( k) - F 2 g[y(k-1)] (1.4a) 

or 

F-jCuCk) ,y(k) ] = y (k)-F 2 g[u(k) ,y (k-1) ] (1.4b) 

and reformulating the analysis model as shown in Figures 

1.4a or 1.4b. These model forms are often referred to as 

prediction error models since their outputs are the errors 

in predicting y(k) rather than the predictions themselves. 

There are, however, no substantive differences between the 

modeling approaches depicted in Figure 1.3, 1.4a and 1.4b. 

The equation error formulation can be generalized to 

multiple input multiple output systems as well (henceforth 

referred to as multichannel systems) by considering u(k) and 

y(k) as vectors of input signals and output signals 

and F 1q , F 2 q, F 3Q , F 2 and ^3 as vector functions. The 

prediction error signal e(k) becomes a Q^-vector of signals 

and the model parameters can be set to minimize the trace 

T 

of the prediction error covariance matrix P = s{e(k)e(k) }. 
Such generalizations have been developed to a degree in the 
multichannel filtering literature and will be investigated 
further here . 
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Figure 1,4. Prediction error model forms. 
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It is important to keep in mind however, that while the 
equation error formulation can be used to find a model solu- 
tion, it is an indirect method as opposed to the direct form 
method which minimizes the mean square value of output error. 
The direct form model has been modified to obtain the equa- 
tion error analysis model so that the parameters can be ob- 
tained via the solution of systems of linear equations. The 
price paid for this simplification in the model analysis 
problem is that additive noise on the measured system out- 
put will introduce a bias in the model coefficient estimates. 

B. OVERVIEW 

Chapter II along with appendices A through F provide a 
unified review of the existing background theory on minimum 
mean square equation error modeling of linear systems. The 
moving average (HA) and autoregressive (AR) models are pre- 
sented and their relative merits compared. In Section II. C. 
the Levinson algorithm [Refs. 9, 10 and 27] for the AR and 
MA models is developed, greatly simplifying the solution 
process for these models. Section II. D. then shows that the 
Levinson algorithm defines the AR and MA models in terms of 
lattice filter structures. 

These lattice structures have received widespread atten- 
tion and have led to a host of new developments in modeling, 
spectral estimation, filter structures and adaptive filtering. 
Examination of the properties of these forms have suggested 
a number of new methods for calculating model coefficients 



that offer increased accuracy, and in some cases guarantee 
model stability even in the presence of correlation estimates 
obtained by averaging over short time intervals [Refs. 5, 

20, 29 and 36]. Applied by Burg [Ref. 5] to spectral esti- 
mation, these methods allow the determination of power 
spectra via AR modeling from very short runs of data without 
any need for the use of a window function. In finite pre- 
cision arithemetic implementations , the lattice structures 
have been shown by Markel and Gray [Ref. 33] to be less sensi- 
tive to roundoff noise and coefficient quantization than 
direct structures and have led to the design of other 
structures that offer improved performance over conventional 
parallel realizations. Griffiths has shown that these 
lattices can be implemented adaptively [Refs. 16, 17 and 18] 
and that they offer the potential for more rapid convergence 
than conventional LMS adaptive filters. Recently Morf [Refs. 
36, 37 and 38] has also used these lattice structures to 
implement a recursively updated deterministic least squares 
adaptive scheme. It is readily apparent therefore, that the 
original work of Levinson and the lattice structures that 
have evolved from it have had an .important impact on the 
field of digital signal processing, 

In Section II. E., the multichannel generalization of many 
of the single channel AR and HA modeling results is presen- 
ted. After a discussion of the basic multichannel AR and MA 
models [Refs. 26 and 45], the multichannel version of the 
Levinson algorithm originally developed by Whittle [Ref. 56], 
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and Wiggins and Robinson [Ref, 61] is presented. A new 
form for the models is introduced and used here however, 
to facilitate the application of these results later in 
various other modeling problems. Multichannel lattice 
structures are then derived and from them alternative 
solution methods for the modeling problems are developed. 

Finally, in Section II. F. the LMS adaptive algorithm 
[Ref. 58] is reviewed and the adaptive implementations of 
the lattice structures due to Griffiths are presented. 

In Chapter III, the more general autoregressive moving 
average model is presented using the equation error formu- 
lation attributed to Kalman [Ref. 23]. After a brief 
discussion of the model, new model transition formulas 
are developed showing how the ARMA model is related to the 
simpler and less general AR and MA models. System input 
signal requirements for the ARMA modeling process are ex- 
plored and it is shown that the power spectrum of the input 
signal can be considered as a frequency dependent weighting 
function in the model optimization. Then the main result 
of the chapter is presented. With suitable assumptions, 
a recursive in order solution method for ARMA modeling 
(the (n+l)=st order solution is obtained from the n-th 
order solution) is obtained based on the Levinson algorithm for 
multichannel AR models . From this , lattice solution methods 
for the ARMA model are developed in both batch processing 
and adaptive form. (Batch processing here refers to assuming 
ergodicity and estimating correlations with time averaging) . 



A similar result has recently been presented by Morf [Refs. 

37 and 38] with the assumption of a white noise input sig- 
nal to the system. The results presented here follow from a 
different approach without the assumption of a white noise 
input. Experimental results are also presented verifying 
the methods and theory, and showing their advantages (and 
disadvantages) over conventional ARMA modeling methods. 

The programs used in these simulations are listed in Appen- 
dix J. In Section III.F. , and Appendix G, it is shown that 
these single channel methods readily extend to the multi- 
channel ARMA model, and as one would expect, can be obtained 
as a special case . 

In Chapter IV two types of nonlinear models, the Volterra 
series model and the new nonlinear ARMA model recently pro- 
posed by Parker [Ref. 64], are considered. After a brief 
discussion of the Volterra model, it is shown that the 
solution can be obtained using multichannel MA lattice methods 
if the regular form of the Volterra kernels is used in place 
of the conventional symmetric form. Then the nonlinear ARMA 
model is presented in Section IV. B. and it is shown that for 
many systems, this model can remedy the problem of the large 
number of terms (ideally infinite) required by the Volterra 
model to represent the system in much the same way that the 
ARMA model solved the problem arising in the MA model. In 
Section IV. B. 2 it is also shown that by using the regular 
form, the solution for the nonlinear ARMA model can be 
obtained using multichannel AR lattice methods and that the 
linear ARMA model solutions developed in Chapter III follow 
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as a special case. Appendix I then presents two examples of 
nonlinear ARMA modeling. First a somewhat academic example 
of a cascade of linear and nonlinear subsystems is given 
then a nonlinear ARMA model is proposed for the tracking 
behavior of a phase locked loop. 

Finally, in Chapter V, two applications for the linear 
and nonlinear ARMA modeling methods developed in Chapters 
III and IV are discussed briefly. (They are reduced order 
modeling of complex systems and modeling for fault detection 
and diagnosis.) Then in Section V.B. conclusions are drawn 
on the results of this work and a list of significant open 
questions (both old unanswered questions and new ones 
raised here) is compiled. 
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II. DISCRETE TIME LINEAR SYSTEM MODELING; BACKGROUND THEORY 



While few physical systems are absolutely linear, linear 
models often suffice to accurately describe their behavior 
under normal operating conditions, A rich body of theory 
has therefore been developed for the analysis and modeling 
of linear systems [Ref. 22] and a thorough knowledge of 
this theory is vitally important to anyone interested in 
understanding the functioning of these systems. The con- 
tinuing expansion in the availability of powerful, inexpen- 
sive digital computing capabilities has also made discrete 
time techniques take on a special prominence. With this 
as motivation, the portion of the background theory in 
discrete time linear modeling upon which much of the re- 
mainder of this work depends , is developed here from the 
unifying standpoint of a minimum mean square equation error 
model solution. 

The moving average and the autoregressive models are 
developed first for single input single output systems. 

Their solution via the Levinson algorithm is presented and 
from this algorithm alternate solution methods based on 
lattice filter structures are derived. It is shown that 
almost all of these results can be generalized to the 
multiple input multiple output case and the corresponding 
multichannel modeling methods are developed. Finally, 
adaptive implementation of the modeling methods for both 
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the conventional filter structures and the lattice filter 
structures is presented as an alternative means of obtaining 
model solutions. 

A. MOVING AVERAGE MODELS 

The moving average (MA) model was among the earliest 
discrete models developed. [Refs. 4, 11 and 19] It estimates 
the current value of the output of a system as a weighted 
combination of the present value and N past values of the 
system input where N is the order of the model. The problem 
then is to estimate the weighting function or impulse re- 
sponse of the MA model in some fashion. Since the MA model 
characterizes a system in terms of a finite duration approx- 
imation of its impulse response and since any linear time 
invariant, single input single output system is completely 
specified by its impulse response, the MA model is quite 
general and can be used for a wide class of systems. De- 
fining (N+l ) -vectors of model weights and input data as 

T 

a + = [a(0) ••• a(N)] ^ ^ (2.1a) 



A superscript "+" is used to indicate that in spite of 
the fact that these vectors are used for a N-th order model, 
they are (N+l ) -vectors with elements . indexed from zero to N 
rather than from one to N. Superscript T demotes transpose. 

2 

Superscripts in parenthesis will later be added to the 
model coefficient vectors to explicitly indicate their de- 
pendence on the order of the problem being solved. They are 
omitted for simplicity however whenever doing so does not 
result in ambiguous notation. 
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u + (K) = [u(k) ••• u(k-N)] 



(2.1b) 



the MA estimate of the system output becomes 



y(k) 



^ a(n)u(k-n) = u + (k) a + 
n= 0 



In terms of the modeling approach of Figure 1.3, F^g and 
Fgg are assumed to be zero. F^ is a linear time invariant 
function of past and present values of u(k) . Assuming sta- 
tionarity, an expression for the mean square value of the 
error as a quadratic function of the weights (a(n)} is 
given by 



= a + R + + a + 
u u 



-2a 



HI + 

u y 



R (0) 
uu 



Af 



( 2 . 2 ) 



where 



— vw 



in general R (n) = e (v(k)w(k+n) } , r = e (v(k)w(k+n) } , 
vw — vw — ’ 

T 

e{v(k)w(k) } and e{ } denotes expectation. 



R (0) • • • R ( -N ) 

uu uu 



* + + 
u u 



R (N) • • • R (0) 
uu uu 



r 



. + 

u y 



[R uy <0) R uy <H,:|T 



The surface described by equation 2.2 can be pictured as a 
concave hyperparaboloid with a unique minimum and the 
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characteristics of such a surface are described in Appendix 
F. For example when N=l, the MSE as a function of a(0) and 
a(l) appears as shown in Figure 2.1. 



t 



E 



2 




a(0 ) — ► 




Figure 2.1. MSE as a function of model weights 
for a first order (N=l) MA model. 



The minimum mean square error solution for the coeffi- 
cients is given by 



- + + - OPT 
u u 



— + 

u y 



(2.3) 
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and the corresponding minimum value of the cost function 



E 



2 



is 



E 



2 . 
mm 





(2.4) 



u y 



Equation (2.3) is a discrete time matrix form of the Wiener 
Hopf equation 



where u(t) and y(x) are the continuous input and output 
signals and h(x) is the system impulse response. The pro- 



equivalent of deconvolving the input autocorrelation func- 
tion from the cross correlation of input and output to 
obtain the system impulse response in equation (2.5). 
Consequently the MA modeling process has been called dis- 
crete Wiener filtering or stochastic deconvolution. 

This model constitutes a direct form approach as defined 
in section 1.1 but does not encounter difficulty in obtaining 
the model weights since both F^g and F^g are assumed to be 
zero. As such, it possesses the advantage that the estimates 
of the model parameters will not be biased by the presence of 
additive noise on the output of the system as shown in 
Figure 2.2, as long as the noise is uncorrelated with the 
input signal. This can readily be seen by replacing y in 
equation 2.3 by y + v. Additive noise on the input signal 



00 




(2.5) 



cess of finding a^p^ in equation (2.3) is the discrete time 
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v(k) 




e(k) 



Figure 2.2. Moving Average Modeling 

however will adversely affect the modeling process, and 
introduce a bias in the solution for the model coefficients. 

In the transform domain-, the model can be represented by 
a polynomial in powers of z ^ and has therefore been referred 
to as an all zero model 

N 

A(z) = ]T) a(n) z n (2.6) 

n = 0 

In terms of this transfer function relationship, any bias 
introduced in one or more of the model coefficients has the 
effect of shifting the zero locations of the model. 

In summary a discussion of the advantages and disad- 
vantages of MA modeling is instructive. 

Advantages : 

1) The solution for the model parameters involves 
only linear equations . 
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2 ) 



The solution is unbiased in the presence of 
additive noise on the system output as long as 
the noise and system input are uncorrelated. 

3) Since the model is nonrecursive it is always 
stable , 

Disadvantages : 

1) The number of terms (N+l) needed for sufficient 
model accuracy may be quite large. 

2) The solution of a large system of equations is 
required . 

3) The required correlation terms are usually not 
known and must be estimated by assuming ergodi- 
city and averaging in time. This requires the 
data to be windowed and set to zero outside the 
averaging interval in order to maintain the even 
symmetry of the autocorrelation functions. 

4) The modeling process is restricted to linear 
time invariant systems. 

B. AUTOREGRESSIVE MODELS 

The autoregressive (AR) model attempts to deal with one 
of the difficulties (1) encountered in MA modeling; the need 
for a large number of coefficients to accurately describe 
the model. [Refs. 2, 4, 11, 19 and 23] In AR modeling, 
which is sometimes referred to as linear prediction, a pre- 
diction error approach is considered where 

N 

e(k) = y(k) - £ b(n) y(k-n) (2.7a) 

n=l 
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This can be written as 



e(k) = y(k) - y(k) 



= y(k) - y(k) T b 

with 



y(k) = Cy(k-l) 



y(k-N) ] T 



and 

b = [b( 1) • • • b(N) ] 1 



(2.7b) 



(2.7c) 



( 2 . 7d) 



Here F 1Q and F^g are assumed to be zero (this assumption 
will be modified later to allow a dependence on the input 
signal in the synthesis phase) and F^g provides an estimate 
of the current value of the system output as a weighted sum 
of N past outputs. The mean square value of prediction 
error as a quadratic function of the weights {b(n)} is 
given by 



E 0 = b T R b - 

2 yy 



2 b T r 

— yy 



R (0) 

yy 



(2.8a) 



and the corresponding MMSE solution for the weights is 
given by 



R b^T,™ = r 
-yy-OPT -yy 



(2.8b) 
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with 



E 



'mm 



R (0) 

yy 



^OPT r yy 



(2.3c) 



Using equation 2.7a, an expression can be written 
in the transform domain for the prediction error model which 
accepts y(k) as its input and produces the error sequence 
e(k) as its output. 

N 

= 1 - E h(n) z -n = 3(z) (2.9) 

n=l 

If it is assumed that the system input output relationship 
can be represented in transfer function form with 



Y(z) 

uTzT 



H( z) 



q(Q) 

1- ^ b( n) z’ n 
n=l 



a(0) 

BTzT 



( 2 . 10 ) 



and that the model parameters can be determined so that B(z) 
= B(z), then the prediction error output will be exactly 
e(k) = a(0) u(k). For this reason AR prediction error 
modeling has often been called inverse filtering since the 
prediction error filter essentially reverses the actions of 
the system (with the exception of a gain) . Since the 
analysis model is in this inverse form rather than in the 
direct form, the presence of additive noise on the measure- 
ment of the system output as shown in Figure 2.3 will intro- 
duce a bias in the solution for the model parameters. This 
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v(k) 




Figure 2.3. Autoregressive prediction error 
modeling as an inverse filtering 
process . 

has the effect of shifting the roots of 3(z) which are es- 
timates of the poles of the system and is the price paid 
for the ability to obtain the model solution from a set of 
linear equations. 

Thus far, only the analysis portion of the AR modeling 
process has been discussed. With the inverse filtering 
interpretation of the prediction error analysis model, a 
reasonable synthesis model is given in transfer function 
form as 



H(z) 



a ( 0 ) 
B( z) 



( 2 . 11 ) 



with the gain term set so that the mean square value of 
a(0) u(k) is the same as that of the prediction error 
signal. Thus it follows that 



a( 0 ) 2 



c (e(k) 2 } 

Hr ToT" 



uu 



( 2 . 12 ) 
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Since the synthesis model is in the form of an all pole 
filter, an appropriate impulse response with infinite dura- 
tion might be obtained using a low order model (small N) , 
a result that is impossible to obtain in any finite order MA 
model. This is not to say however that a low or even finite 
order AR model will always be an appropriate model for any 
linear system. If the transfer function representation for 
a system contains both zeros and poles, no finite order AR 
or MA model can serve to exactly represent it. This fact 
can be understood by considering the form of a geometric 
series 

CO 

~ZT = E (Cz _1 ) n for | C z — 1 1 <1 (2.13) 

1-Cz x n= 0 

which shows that a single pole can be represented by an 
infinite number of zeros and visa versa. Thus if the sys- 
tem has a single zero, a high order AR model may be required 
to represent it with sufficient accuracy. 

In summary, the advantages and disadvantages of AR 
modeling may be listed as follows: 

Advantages : 

1) The solution for the model parameters involves 
only linear equations . 

2) Sometimes an appropriate infinite impulse re- 
sponse can be obtained with a small number of 
parameters in the model. 

3) Direct knowledge or measurement of the system 
input is not required for determing the system 
poles . Only a knowledge of its mean square 



31 



value is necessary for determining the gain 
factor . 

Disadvantages : 

1) The model is biased by the presence of additive 
noise on the measured system output signal. 

2) The number of terms required for sufficient 
model accuracy may be quite large if zeros are 
present in the system. If this occurs, the 
inversion of a large matrix will be required. 

3) The required correlation terms are usually not 
known and must be estimated by assuming ergodicity 
and using time averages, 

4) The modeling process is restricted to linear 
time invariant systems. 

This list of advantages and disadvantages is quite 
similar to the one compiled for HA models with two notable 
differences; the bias in the model and the absence of a 
requirement for input measurements. This second point is 
significant in that it has led to the application of AR 
modeling to many problems where an input signal is unmea- 
surable or indeed does not exist including speech modeling 
and spectral estimation. [Refs, 2, 5, 12, 15, 21, 32 and 44] 
The noise problem has restricted the process to applications 
where measurements with sufficiently high signal to noise 
ratio are available, making the effects of the bias minimal. 
[Refs. 24 and 43] 
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c. RECURSIVE IN ORDER SOLUTIONS FOR AR AND MA MODELS 

The preceeding disucssions of the AR and MA modeling 
problems tacitly assummed an a priori knowledge of the correct 
model order. If this knowledge is not available a reasonable 
approach for determining the correct model order must be 
developed [Ref. S3], A commonly employed strategy is to 
successively increment the model order while observing the 
MSE until further increases fail to substantially reduce the 
MSE. This requires solving for a number of different models 
and can be an arduous task if equations (2.8b) or (2.3) are 
employed directly. 

The autocorrelation matrices appearing in the AR and 
MA model equations (2.8b) and (2.3) are highly structured 
matrices (both Toeplitz and symmetric) and this fact can be 
exploited to facilitate the solution of these equations. 

The Levinson algorithm [Refs. 9, 10 and 27] makes use of 
this structure to obtain model solutions recursively in 
order, that is, the solution for the n-th order model is 
assumed to be known and the solution for the (n+l)-st order 
model is then obtained from it. In this manner it is pos- 
sible to start with a first order AR or a zeroth order MA 
solution given by a single equation and build up the 
desired order solution. The AR model will be treated first 
since it is a special case that simplifies the analysis. 

The simplifications arise due to the fact that the r vector 

-yy 

on the right hand side of equation (2.8b) is made up pri- 
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Superscripts in parenthesis are used to explicitly indicate 
the order of the problem when specifically needed. 

1 . The Levinson Algorithm For AR Modeling 

The n-th order AR model solution of equation 2.3b 
is given by 



~ (n) , (n) (n) 

-yy * - £yy 



(2.14) 



The Levinson algorithm assumes a relationship between the 
n-th and (n+l)-st order solutions given by 



, (n+1) 


K (n) 

b 




£ (n) 


b = 


0 


+ 


pLl) 



(2.15) 



and solves for the vector and the coefficient k^ n+ ’*"^. 

Define permuted versions of the vectors b^ and r as 

-yy 

f n and £yy. n ^ by reversing the order of their elements. 



f (n) 


f \ ( n ) 

b(n) 


P <n) = 

-yy 


R (n) 

yy 

• 




c 

1 1 

,C> 

1 




• 

R (1) 

yy 



(2.16) 



Because of the Toeplitz symmetric structure of the auto- 
correlation matrix, equation (2.14) can also be written as 



R (n) 

-yy 



(n) 



P (n) 

-yy 



(2.17) 
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and this relationship is essential in the development of the 
Levinson algorithm. (To apply the algorithm therefore when 
time averaged estimates of the needed correlations are used, 
the data must be windowed prior to averaging to maintain the 
even symmetry in the autocorrelation function estimates and 
produce the required structure in the autocorrelation matrix.) 
Making use of equation (2.15). in the (n+l)-st order version 
of equation (2.14), and using the relationship of (2.17) to 
solve for e and k results m 



(n) ^(n). (n+1) 

e = -f k 



0 

(2.18a) 



and 



, (n+1) 
k 



R (n+1) 
_ZZ 



R (0) - 



yy 



( n) T , ( n) 
- £ b 

. (n) T ^ (n) 
b r 



( 2 . 18b) 



Therefore, in using equations 2,15 and 2.18 to obtain b^ n+ ~^ 
from b^ n ^ via the Levinson algorithm only one new piece of 
information, k^ n+ ^, need be calculated. The denominator 
of equation (2.18b) can also be recognized from equation 
(2,8c) as the MMSE for the n-th order AR model E 2 ^ n \ and 
thus there is little concern over the possibility of it being 
zero. If the n-th order solution produces a perfect predic- 
tion (zero MSE) there is no point in trying to find a better 
prediction by increasing the order to n+1. The evaluation of 
equation (2.18b) can be further simplified by observing that 
the MSE also follows a recursion from one order to the next 
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given by 



P (n+1) 
L 2 




k (ntl > 2 ] 



( 2 . 19 ) 



making it unnecessary to evaluate the denominator at each 
value of n. (Details of this derivation are omitted here 
but included in Appendix A in the derivation of the more 
general multichannel Levinson algorithm.) This relation for 
the propagation of mean square prediction error also leads 
to the restriction that k^ n+ '^ must be bounded in magnitude 
by unity. 

2 . The Levinson Algorithm For MA Modeling 

Next consider the n-th order MA model given by 



(n) 

. + + 
u u 





u y 



( 2 . 20 ) 



and again, assume a relationship between the (n+l)-st order 
and n-th order solutions given by 



+ (n+1) 


+ (n) 
a 




1 

1 1 

+ 

'w' 

H 

1 


a = 




+ 


— 




0 




(n+1) 






CT 

o 



Notice that in this n-th order problem, R is actually a 

u u 

n+1 by n+1 matrix and could be written as R ^ n+ D. Define 
J — uu 
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( 2 . 22 ) 



r <n+l) 
— uu 



R (1) 
uu 


(n+1) 
’ — uu 


R (n+1) 
uu 
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R (n+1) 
uu 
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Using equations (2,21) and (2,22) in the n+1 order MA model 
equation it follows that 



(n+1) .(n+1) (n+1) 

Y = - f g 



(2.23a) 



where f^ n+ ^ £ s defined in a manner similar to (2.16) and is 
comprised of the coefficients that arise in an (n+l)-st 
order autoregression on the input signal u(k) , 

Therefore to obtain a moving average model relating the 
system output signal y(k) to the input signal u(k) , an 
autoregressive model for the system input must first be 
solved. Furthermore, 



g 



(n+1) 



/ , , J , ( n) 

d / , , , (n+1) + 

R (n+l)-p a 

uy — uu — 

R ( 0 ) -p (n+1) f (n+1) 
uu --uu — 



(2.23b) 



and the denominator of equation (2.23b) is the MMSE in the 
(n+l)-st order autoregression on the signal u(k) . 

It is significant to note that in applying the 
Levinson algorithm to find a given order AR or MA model, all 
lower order models along with their MSE ' s are obtained. 

( n ) 

Also, intermediate quantities emerge (the{k } in the AR 
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model and the {k^ n ^} and {g^ n ^} in the MA model) which fully 



characterize the models and could be used as an alternative 
to the {a(n)} or (b(n)} coefficients, This point will be 
developed further in subsequent sections. 

D. LATTICE FORM AR AND MA MODELS 

The Levinson algorithm derived in the previous section 
can be used to derive lattice structures to implement the 
MA model and the AR analysis and synthesis models as alter- 
natives to a tapped delay line type of implementation using 
the coefficients a(n) or b(n) directly. [Refs. 29, 30, 32 
and 33] 

1 . The AR Modeling Lattice Structures 



order solutions to the AR modeling problem determined in 
equations (2.15) and (2.18a) it follows that the transfer 
function of the prediction error model can be written re- 
cursively in order as 



From the relationship between the (n+l)-st and n-th 




2 



O 



Defining a new transfer function 





equation (2.24) can be written as 




(2.25b) 
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(z) recur- 



and an expression can also be written for B*' n+ '^ 
sively in order by rewriting equation (2.25a) for order 
(n+1) and substituting equation (2.24) yielding. 



B (n+1, (z) = f¥ n) (z) - k (n+1) B <n) (: 



(2.25c) 



As discussed earlier in connection with equation (2.9), 
B^ n ^(z) describes the n-th order prediction error model and 
when its input is the system output Y(z), it produces the 
n-th order prediction error signal. 



E (n) (z) = B (n) (z) Y ( z ) 



(2.26) 



In the time domain this signal can be interpreted as the 
error in predicting yCk) forward in time from a 
weighted combination of the n past values {y (k-1) * • 'y(k-n) } . 
To understand the significance of B^ J (z) consider the out- 
put signal when this model is excited by Y(z). 

E (n) (z) = B (n) (z)Y(z) 

= z~ n [l - £ b (n) (i)z +i ]Y(z) (2.27) 

L=1 

In the time domain, e v ; (k) can be interpreted as the error 
in predicting y(k-n) backward in time from a weighted 
combination of the future signals (y (k-n+1) • • ’ y (k) } . These 
n-th order forward and backward prediction processes at time 
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k Q are illustrated in Figure 2.4. Henceforth, an overbar 
will always be used to denote quantaties associated with 
backward in time predictions. 




Figure 2.4. Forward and Backward Prediction Error 
Filtering . 

From equations (2.25b) and (2.25c) equations can be written 
recursively in order for these forward and backward predic- 
tion error sequences as: 

e (n+1) (k) - e (n) (k) - k <n+1) i (n) (k-l> (2.28, 



e (n+1) (k) 



S (n) Ck-l) - 



k (ntl) e (n) (k) 



(2.28b) 



Noting that the prediction error for a zero-order AR pre- 
dictor of y(k) (or no predictor at all) is just the signal 
y(k) itself, 

e (o) (k) = e (o) (k) = y(k) (2.28c) 

the prediction error filter can be drawn in lattice form as 
shown in Figure 2.5 for a second order model. 




Figure 2.5. Lattice Form Of A Second Order Pre- 
diction Error Model. 

This structure has many interesting properties , 
among the most important of which is the successive de- 
coupling property. In going from one order AR model to the 
next, all of the previously determined transfer function 
coefficients (b(n)} will generally change. The Levinson 
algorithm shows however that only one new piece of infor- 
mation is needed to obtain the optimum (n+l)-st order 
solution from the optimum n-th order solution (see equation 
(2.24)), In terms of the lattice filter of Figure 2.4 this 



means that given the optimum n-th order model in lattice 
form, one need only add another stage to the structure, 
setting the coefficient of that stage k^ n+ '*'' ) to minimize 
the mean square value of e^ n+ ^(k). Nothing in the first 
n stages need be changed. The overall high order minimi- 
zation problem is in this fashion decomposed into a se- 
quence of first order minimizations, one at each lattice 
stage . 



Another important property of the lattice which can 
be proven and will be of use later is the orthonogalization 
of the backward prediction error sequence [Ref. 32] which 
states that 



e (e^ ^ (k)e^ ^ (k) } 



0 



i*j 




i=j 



(2.29) 



Thus it is seen that a set of orthogonal signals (the back- 
ward prediction errors at the various stages) are generated 
as a by-product of the lattice model. 

As a consequence of the successive decoupling pro- 
perty of the lattice, a number of alternatives to equation 
(2,18b) for determining the lattice coefficients can be 
found. The most obvious method is to set k^ n+ ^^ to 
explicitly minimize the mean square value of forward pre- 
diction error in equation (2,28a) at the (n+l)-st order 
stage given the best lattice of order n. This is termed 
the forward method and is denoted by a subscript F on the 
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lattice coefficients. The resulting solution is given by 



(n+1) 



e{e (n) (k)g (n) (k-l)} 

e{S (n) (k-l> 2 ) 



(2.30) 



Alternately, the mean square value of the backward predic- 
tion error signal in equation (2.28b) could be minimized to 
determine the coefficient resulting in the backward method 
solution given by 



(n+1) 



'B 



g{e (n) (k)i' (n) (k-l)} 

e{e (n) (k) 2 } 



(2.31) 



Since, however, the forward and backward prediction error 
transfer functions are given by B^ n ^(z) and z -n B^ n ^ (z - ^) , 
it follows that 



|B (n) (z)| = | B (n) ( z) | 



(2.32) 



and since they are both driven by the same input, Y(z), the 
mean square values of both the forward and backward predic- 
tion error signals at a given stage are the same making 
equations (2.30) and (2.31) equivalent. It is also possible 
to show that they are equivalent to equation (2.18b). 
Recognizing that the required expectations will eventually 
have to be estimated by using time averages, these two 
methods for calculating k^ n+ ^ will not in general be 
exactly equivalent and it might be preferable to use the 
arithmetic mean of the mean square values of forward and 



i . o 



backward prediction error as a cost function 



iCe {e 



( n+1 ) 



(k) 



— (n+1) ,, v2 n 
e (k) }] 



(2.33a) 



This leads to a third method derived by Burg [Ref. 5] in 
his work on maximum entropy spectral analysis and given by 



k 



( n+1 ) 
BG 



2e{e (n) (k)e (n) (k-l) } 

s{e (n) (k) 2 }+£{e (n) (k-l) 2 } 



(2.33b) 



Notice that is the harmonic mean of kp n+ ^ and kg n+ ^ . 
A fourth method due to Itakaura and Saito [Ref. 20] can also 
be derived which results in 



k (n+1) = £ {e (n) (k)e (n) (k-l)}- (2.34) 

IS yl r (n) 7T X r-(n),. n 2 1 ' 

V e (e (k) }s {e (k-1) } 

and is simply the geometric mean of the forward and 

backward coefficients . 

Since equation (2.34) is of the form of a normalized 
correlation k^g will always be bounded by unity in magnitude 
as required by equation (2.19). Furthermore since 



Harmonic Mean | _< | Geometric Mean 

it follows that will be similarly bounded. These 

bounds are significant since Markel [Ref. 32] has shown that 
| k ^ n ^ | <1 is a necessary and sufficient condition to ensure 



that the roots of B^ n \z) be within the unit circle guaran- 
teeing the stability of the n-th order all pole model. No 
such guarantees of model stability exist when the forward 
or backward solution methods of equations (2.30) and (2.31) 
are used with the correlation estimates obtained by aver- 
aging for finite time intervals. 

To determine the AR synthesis model in lattice form 
it is only necessary to rewrite equation (2.28a) as 

e (k) = e (k) + k e (k-1) (2.35) 

Together with equations (2.28b) and (2.28c), this describes 
the structure shown in Figure 2.6 for a second order case 
and when it is driven by the second order prediction error 
signal, it will reconstruct y(k) exactly. Thus it imple- 
ments the transfer function — pi or, in general, when 

± B w; (z) 

stages are used — pp . Recognizing that if the pre- . 

B UO (z) 

diction error model is an accurate model of the system 
denominator polynomial, e^^ (k) =a(o)u(k) , this input signal 
is used in the synthesis model. Because of analogies with 
transmission lines and wave propagation models, the lattice 
coefficients have been referred to as reflection coefficients. 
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Figure 2.6. Lattice Form Of The All Pole Synthesis 
Model For The Second Order Case. 



2 . The MA Modeling Lattice Structure 

A similar lattice form is applicable to the MA 
modeling problem. From equations (2.21) and (2.23a) the 
transfer function of the MA model can be written recursively 
in order as 



A (n+1) (z) = 



A <n) (z) 



(nn) g nn (z) 



(2.36) 



where, as discussed in connection with equation (2,23a), 
B n+ ~^(z) is the backward prediction error transfer function 
for an autoregressive model of the input signal u(k) . 
Multiplying both sides of equation (2,36) by U(z) and trans- 
forming into the time domain it follows that 



A (n+1) ,,, 
y (k) = 



*(n) 

y (k) 



( n+ 1 ) — n+ 1 . , , 
; e (k) 



(2.37) 
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where e 11 ^(k) is the backward prediction error signal from 
the autoregression on the input signal u(k) , and can be 
obtained by operating a prediction error lattice with u(k) 
as its input. Then with the additional term in equation 
(2.37) the lattice form of the MA model can be drawn as 
shown in Figure 2.7. 




Figure 2.7. Lattice Form Of The MA Model (Second 
Order Case) . 

It was stated earlier that the AR prediction error 
lattice, as a by-product, forms a set of orthogonal or 
uncorrelated backward prediction error signals from its 
input. Here in the MA model, these orthogonal signals are 
linearly combined to form the MA estimate of the system 
output. If the input signal u(k) is a white process, an 
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examination of any of the solution methods previously dis- 

cussed will show that all of the {k ' } lattice coefficients 

will be zero since the delayed samples of a whil e process 

/ \ 

are already orthogonal. Otherwise the (k v '} lattice 
coefficients will be set to orthogonalize the backward pre- 
diction error signals. As a consequence of this, the 

( n ) 

weighting coefficients g can be set independently of 
each other; that is g*" 1 ^ can be set to minimize 



e{e^ n \k) 2 } = e{[y(k) - y^ n (!k)] 2 } 



(2.38) 



a ( n ) 

given the best prediction of order n-1, y ^(k). This results 

( n ) 

in an alternative expression to equation (2.23b) for g 
given by 



. (n) 



£{e^ U (k) e (n) (k)} _ s ( y(k) ; (n) (k)} 



(n) , , v 2 ■, 
£ {e (k) } 



/— (n) ,, ,2, 
£ {e (k) } 



(2.39) 



( n ) 

Here e^ ; (k) is the error between the system output and its 
n-th order MA estimate. 



E. MULTICHANNEL AR AND MA MODELING 

Both the AR and MA modeling problems previously dis- 
cussed, as well as their solution via the Levinson recursion 
and lattice filter methods, can be generalized to the 
multichannel case by replacing the various signals with 
signal vectors and replacing the weighting coefficients 
with appropriately dimensioned matrices of coefficients. 
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A discussion of this appears in Robinson [Ref. 45]. The 
equations that describe the AR and MA models and their MMSE 
solutions are repeated here for convenience. 



N 



AR 


y (k) 


= £ Mi) 
i=l 


R (0) 


R (-1) . . 


. . . R 


yy 


yy 


yy 


R (1) 


R (0) . . 


. . . R 


yy 


yy 


yy 


R (N-l) 


R (N-2 ) . 


. . . R 


yy 


yy 


yy 






N 


HA 


y(k) 


= £ a(i 

i = 0 


R (0) 


R (-1) , . 


. , . R 


uu 


uu 


uu 


R (1) 


R (0) . . 


. . . R 


uu 


UU 


UU 


R CM) 


R (N-l) . 


, . . R 


uu 


uu 


uu 


n a multichannel generalization 


q output : 


signals, u(k) 


becomes < 



(2.40a) 





Ml) 




R (1) 

yy 




M2) 


r 


R (2) 

yy 

• 




b(N) 




R (N) 

yy 


-i) 




( 




a( 0 ) 




R (0) 
uy 




a( 1) 


= 


R (1) 
uy 




a(N) 




R (N) 
uy 



(2.40b) 



(2.41a) 



(2,41b) 



Mi) becomes a square matrix of x coefficients and a(i) 



becomes a x Q. matrix of coefficients so that the N-th 



order multichannel model equations can be written as 



y(k) = 23 b(n) y(k-n) 
n=l 



N 



AR 



(2.42a) 



MA 



y(k) = 23 a(n) u(k-n) 
n=0 



(2.42b) 



The equations for the MMSE solutions (2.40b) and (2.41b) 
generalize directly as well by replacing each correlation 
coefficient R (n) by matrices of correlation coefficients 

T7T.7 J 



where v(k) and w(k) are signal vectors. This causes the 
overall correlation matrices to take on a block Toeplitz 
structure. The transfer function relationships of the AR 
prediction error model and the MA model take the form of 
matrix polynomials 



vw 



given by 



R (n) = e{v(k) w(k+n)^} 
— vw — — 



(2,43) 



B(z) = b(l) z" 1 + >•• + b(N) z”^ 



(2.44a) 



A(z) = a( 0 ) + a( 1 ) z” 1 + ••• + a(N) z' N 



(2.44b) 



so that 



E(z) = [I - B(z)] Y(z) 



(2.44c) 



( 2 . 44d ) 



Y(z) = A(z) U ( z ) 



where E(z) is the transform of the multichannel AR prediction 
error vector e_(k) = y_(k) - y^(k) . Alternately, equations 
(2.44a) and (2.44b) can be written as single matrices whose 
entries are polynomials in z rather than scalars. B(z) is 
of necessity a x Qq square matrix polynomial while the 
dimensions of A(z) (Q^ x Q^) depend upon the number of 
inputs and outputs which need not be the same. 

In the single channel AR problem, B(z) provides the 
transfer function of the prediction error, or inverse fil- 
ter, and must be inverted to obtain the all pole synthesis 
filter. The stability of the synthesis model therefore 
depends upon the roots of this polynomial. The matrix 
polynomial [I-B(z)] in the multichannel AR problem is, in 
like fashion, an inverse filter representation and must be 
inverted to obtain the synthesis model. This inversion of 
a matrix with polynomial entries is defined in the same 
manner as the inversion of a square matrix with scaler 
entries. To see what this inverse matrix polynomial looks 
like consider as an example a two channel autoregression 
with a prediction error filter given by 



Cl-B(z)] = 



1 - b i;L (z) 



-b 12 (z) 



' b 21 (z) 



1 - t> 22 ( z) 
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Applying Cramer's rule, the inverse matrix polynomial is 
written as 



Cl-B(z)] 



-1 



det [I-B Cz ) ] 



l-b 22 (z) b 21 (z) 



b^ 2 (z) l-b^^(z) 



and it is apparent that the stability of the multichannel 
synthesis model is dependent upon the locations of the zeros 
of the polynomial det [ I-B ( z) ] , 

This straightforward generalization of the AR and MA 
models is what has customarily been used in the literature 
to develop the multichannel models and similarly derived 
generalizations of the Levinson algorithm to solve them 
recursively in order are available as well. [Refs. 57 and 61] 
The multichannel AR and MA modeling problems and their 
solutions via the Levinson algorithm can however be recast 
as shown in Appendix A to make them compatible with the form 
of other linear and nonlinear modeling problems . To avoid 
confusion later in the application of the results, the 
derivations in Appendix A have been carried out in a generic 
form with x and d used to represent some of the signals and 
coefficients respectively. The symbols u, y, a and b have 
been reserved to denote system input, output and weighting 
coefficients . 

Equations (A. 7) and (A. 26) provide the MMSE solutions to 
the multichannel AR and MA modeling problems in forms 
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different than (although entirely equivalent to) those 
resulting from the straightforward generalizations of 
equations (2,40b) and (2.41b), The multichannel generali- 
zation of the Levinson algorithm derived in Appendix A 
can, with one exception, be seen as a matrix algebra 
generalization of the single channel algorithm and as, one 
would suspect, the single channel algorithm results as a 
special case of Appendix A. The one exception is that in 
the multichannel case, the n-th order forward and backward 
prediction error models are not simply related to one 
another. The single channel AR backward predictor is given 
by z n B^ n \z~'*') but in the multichannel case the backward 
prediction is not z _n [I_- ( z ~^~ ) ] . 

■Because of this, two reflection coefficient matrices 
K and K are required at each stage in the recursion to 
relate the n-th and (n+l)-st order solutions rather than 
just one as in the single channel problem. Also, in the 
single channel case, the fact that | B ^ n ^ ( z ) | = |B^ n ^(z)| 

and therefore e (e^ n ^ (k)^}=e (e^ n ^ (k)^} made possible the de- 
rivation of the Burg algorithm and the Itakaura-Saito al- 
gorithm, which ensured the magnitude of the reflection 
coefficient was bounded by unity and that equation (2.19) 
would result in nonnegative values of MSE, In the multi- 
channel algorithm however, the forward and backward predic- 
tion error covariance matrices and their traces are not the 
same (except for and P ^ ^ and as a result, straightfor- 

ward generalizations of the Burg and Itakaura-Saito algorithms 
are not possible. 
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Consequently with correlation estimates obtained by averaging 
over finite time intervals , there are no guarantees that 
equations (A. 18a) and (A- 19b) maintain the positive definite- 
ness of the prediction error covariance matrices. 

Multichannel generalizations of Burg's algorithm due to 
Muttal [Refs. 40 and 41], Morf [Ref. 39], and Strand 
[Ref. 50] which guarantee the positivity of the covariance 
matrices - are available but are not explored here because ■ 
of their complexity and because they would take the dis- 
cussion too far afield. 

The relations of equations (A. 20) and (A. 30) which are 
repeated here for convenience permit the construction of 
the multichannel AR analysis and synthesis lattice 
structures and the MA lattice structure. For the multi- 
channel AR model, 



e 



(n+1) 



(k) = e 



(n) 



(k) - K (n+1) e (n) (k-l) 



(2.45a) 




(2.45b) 




(2.45c) 



and the corresponding prediction error lattice is shown in 
Figure 2.8. 



x(k) 





(k) 



(k) 



Figure 2.8. Multichannel AR prediction error lattice 
structure for a second order model. All 
signal paths are vector paths and summa- 
tions are vector summations. The multi- 
plications indicate premultiplication of 
the signal vector by the specified coef- 
ficient matrix. 



To obatin the multichannel AR synthesis lattice, equation 
(2.45a) can simply be rewritten as 

e (n, (k) = e (n+1) (k) + K <n+1> V n) <k-l> (2.46) 



resulting in the structure shown in Figure 2.9. 




> Figure 2.9. Multichannel AR synthesis lattice 

structure for a second order model . 
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For the multichannel MA model, equations (2.45) and 



y 



(n+1) 



(k) 



= y 



;(n) 



(k)+G 



(n+1) 



? <n+1) <k) 



(2.47) 



describe the lattice structure shown in Figure 2.10. 




(k) 



Figure 2.10. Multichannel MA lattice structure 
for a second order model. 

As in the single channel case, the multichannel predic- 
tion error lattice exhibits the successive decoupling pro- 
perty and orthogonalizes the backward prediction errors at 
the various stages so that 

e{e (l) (k) e ( ^k) T } = 

\ 



0 



p(i) 






1 = 0 



(2.48) 
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As a consequence of the successive decoupling, the forward 
and backward reflection coefficient matrices at the (n+l)-st 
stage can be set to minimize the trace of the forward and 
backward prediction error covariance matrices respectively, 
given the best lattice of order n. This provides alterna- 
tives to equations (A. 17) for calculating K and K and also 
generalizes the forward and backward single channel solutions 
discussed previously, resulting in 



It is also possible to show that these relationships are 
entirely equivalent to equation (A. 17), In the multi- 
channel MA lattice, the orthogonality of the backward pre- 
diction error signals also allows the G matrices to be 
calculated in succession providing an alternative to equation 
(A. 28b) and generalizing the single channel solution given 
by equation (2.39). Setting G w to minimize the trace of 
the error covariance matrix 



K (n+1) _ p(n)" 1 ^(n) T 



(2.49a) 



j^(n+l) _ p(n)~‘ 1 ' ^(n) 



(2.49b) 



where 




(2.49c) 




(2.50a) 
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> 



where 



( n ) / , \ 1 \ ( n ) / , \ 

£g (k) = y_(k) - y. (k) 



(2.50b) 



results in a solution given by 




(2.50c) 



Another important characteristic of the lattice solutions 
to the AR and MA modeling problems given by equations (2.49) 
and (2.50) and their single channel counterparts is that they 
do not impose any requirements to window the data when finite 
time averages are used to estimate correlations. The auto- 
correlation function of a signal is inherently an even func- 
tion so that R (n) = R (-n). This fact is responsible for 
much of the special structure of the correlation matrices 
that appear in the model solution equations, and was also 
used in the derivation of the Levinson algorithm. In esti- 
mating the autocorrelation function via time averaging over 
a finite interval, a window function that is nonzero over 
only a given interval must be applied to the data to retain 
this even symmetry property in the estimate. If the data is 
not set to zero outside a given interval, end effects will 

A A 

destroy the symmetry so that R (n) t R (-n) as deDicted 

vv vv 

in Figure 2.11. 



V ( t ) 




v( t-n) 



v(t+n) 



Figure 2.11. Time averaging to estimate correlations 
without windowing. 



In the lattice solutions of equation (2.49) however, there 
is no requirement to make such an artificial assumption 
about the data (that it is zero outside some interval) . 

These properties of the lattice solution methods were 
responsible for their initial use by Burg in his work on 
maximum entropy spectral analysis [Ref. 5] and have con- 
tinued to generate interest in the application of lattice 
methods to other types of modeling problems. 
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F. ADAPTIVE MODELING 



The LMS adaptive algorithm provides a well known alter- 
native method for obtaining the solution to the AR or MA 
modeling problems which does not require the estimation of 
correlations or the inversion of a matrix [Refs. 58, 59 
and 60]. This algorithm updates an estimate of the model 
solution vector at each time instant by an amount propor- 
tional to the negative of the instantaneous gradient of the 
cost function; i.e., in a MA model, 

a + (k+1) = a + (k) - y V + (k) (2,51a) 



where y is a proportionality constant or adaptive gain. 
Since the actual gradient is usually not known, it is 
approximated by using the square of a single sample of the 
error as an estimate of the MSE so that 



P(k> = LeiW 



= -2 u (k) e(k) 



(2 . 51b) 



38 . 



a + =a + (k) 



and 



a + (k+l) = a + (k)+2y u + (k) e(k) 



(2 . 51c) 



In each of the models considered here , the cost function 
(MSE or trace P) is a quadratic function of the model weights 
and defines a concave hyperparaboled with a unique minimum. 
The functioning of the LMS algorithm under these conditions 
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can easily be understood by considering the scalar case of 
equation (2,51) illustrated in Figure 2,12. 

As this illustration shows , the algorithm can actually 
diverge for too large a value of adaptive gain. The rate 
of convergence is also dependent on the size of the adaptive 
gain. Widrow has shown that for stability, the gain must be 
set so that 

0 < y < (2.52a) 

max 

While, in the mean, the weight vector converges with an 
exponential time constant of 



x 



1 



2yA 



min 



(2.52b) 



where X . and X are the smallest and largest eigenvalues 
min max ° & 

of the input autocorrelation matrix R . From the stand- 

u u 

point of stability, y should be made relatively small but 
for rapid convergence, equation (2.52b) dictates that it 
should be large. Setting 



y 



a 



X 



max 



(2.53a) 



where a is a normalized gain and 0 < a <1, equation (2.52b) 
becomes 



x 



1 

2a 



X 

_max 

^min 



(2.53b) 



MSE 




a) Small adaptive gain; steady convergence toward the solution. 




b) Intermediate adaptive gain ; oscilatory convergence toward 




c) Large adaptive gain; divergence away from the solution 



Figure 2.12. Behavior of the LMS adaptive algorithm 
for various adaptive gains. 



and for a wide disparity between the largest and smallest 
eigenvalues (A . << A ), convergence will be quite slow. 

This consideration becomes increasingly important when high 
order model solutions are obtained adaptively since the 
dimensonality of the input autocorrelation matrix will be 
high and the possibility of a wide eigenvalue disparity 
greater . 

These same adaptive techniques have been applied by 
Griffiths [Refs. 16, 17, 18 and 31], to the AR and MA 
lattice filter structures derived in the last section. The 
key difference between the conventional adaptive filter and 
the adaptive lattice is that in the lattice, the adaptation 
is carried out on a stage by stage basis for each of the 
reflection coefficient matrices while in the more conven-' 
tional approach, the entire weight vector is adapted. It 
has already been established that the lattice structure 
makes the model solutions recursive in order. Implementing 
the lattice adaptivity makes the solution recursive in time 
as well since the estimate of the solution at each instant 
is dependent upon prior estimates of the solution. 

The conventional adaptive filter algorithm forms an 
error signal as the difference between some desired signal 
and its estimate; i.e. 



where y(k) is the desired signal, a is the weight vector 
and u + (k) is the input vector, and the time update for the 
weight vector is given by equation (2.51c). To derive the 
adaptive AR lattice consider equations (2.45) for a single 
stage. The lattice in general has vector error and desired 
signals and coefficient matrices as opposed to scalar error 
and desired signals and a coefficient vector in equation 
(2,54) but such a generalization is straightforward. Com- 
paring equation (2.45a) to (2.54) it is clear that: 

1) e < ‘ n+ '*'\k) is analogous to the error signal; 

2) ey ; (k) is analogous to the desired signal; 

3) e^ n ^(k-l) is analogous to the input signal vector. 

Using the trace of as a cost function and applying 

a LMS adaptive algorithm to determine the forward reflection 
coefficient matrix it follows that 



K (n+1) (k+l) = 



„(n+l),, . , 0 (n+D— (n),. , . (n+1),. .T 

K (k) + 2p e (k-l)e (k) 



(2 . 55a) 



With these analogies, equations (2.51c) and (2.55a) are seen 
to be virtually identical with the exception that (2,51c) 
uses a scalar error to adapt a weight vector and (2.55a) uses 
a vector error signal to adapt a coefficient matrix. 

Proceeding in a similar fashion with equation 2.45b it is 
clear that: 

1) e^ n+ '*'\k) is analogous to the error signal; 

2) e (k-1) is analogous to the desired signal; 



3) e^ n ^(k) is analogous to the input signal vector. 

With the trace of p^ n+ D as a cost function, the time 
update relation for the backward reflection coefficient 
matrix is 

K (n+1> (k + l) = K (n+1) (k) + 2d ntl) e (n) (k) l (n+1, (k) T 

(2.55b) 



For a HA lattice, equation (2,47) must also be considered. 
Multiplying both sides of (2.47) by minus one and adding 
3 ^(k) results in 



-0 



(n+1) 



(k) = 



so 



<n) (k) - G (n+1) 5 <n+1) (k) 



(2.56) 



where £g^ n+ ^(k) is defined as in equation (2.50). It is 
evident that; 

1) e.g^ n+1 ^(k) is analogous to the error signal; 

2) £g^ n ^(k) is analogous to the desired signal; 

3) e^ n+ ^(k) is analogous to the input signal vector. 
With the trace of Pg^ n+ ^' > as a cost function, the time up- 
date relation for G^ n+ '*'^ is given by 



G (n+1) (k+1) 



G (n+1) (k) 



0 (n+1) — (n+1),, s 
2 p e (k) 

§ 



e 0 (n+1) (k) T 



(2.57) 
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It is significant to note that three different adaptive 
gains have been used in equations (2.55) and (2.57) and 
that the gains have been superscripted indicating that they 
vary from one lattice stage to the next. For stability consi- 
derations the adaptive gain used in the LMS algorithm must 
satisfy equation (2.52a) and therefore is related to the 
largest eigenvalue of the input autocorrelation matrix by 
equation (2.53a). In developing the time update relations 
for the lattice coefficients, three different input signals 
were used and these inputs also differ from one lattice 
stage to the next. Indeed, even for the case where the in- 
put x(k) to the lattice structure is stationary, the inputs 
to all lattice stages except the first are nonstationary 
since these inputs are outputs of other lattice stages. This 
fact indicates that time varying adaptive gains are appro- 
priate as well in equations (2.55) and (2.57). Equation 
(2.53a) is of no direct usefulness however in setting the 
adaptive gains since the time varying eigenvalues are not 
known. Recognizing that the largest eigenvalue is always 
less than the trace of the input autocorrelation matrix 
(which is a measure of the power in the input signal vector) 
the gains can be set as 



(n+1) .. . 

U (k) = 



a 



a , (k) 
n+1 



(2.58a) 






a 



(2.58b) 



— (n+1) 

v 



(k) 



n+1 



(k) 
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M (n+1) (k) = — 2 7 (2.58c) 

g Y Ck> 

n+1 

2—2 2 

where a ,-,00 , a , , (k) and y , , (k) are estimates of the 

n+1 n+1 'n+1 

power in the three input signal vectors and a is a normalized 
adaptive step size with 0 < a < 1, A method that has commonly- 
been applied to obtain these estimates is to employ a first 
order low pass filter so that 



a_ ,, (k+1) 2 = [l-o3a , , (k) 2 + ae (n) (k-l) T e (n) (k-1) (2.59a) 

n+1 n+1 — 

ct (k+1) 2 = [l-a]a , , (k) 2 + ae (n) (k) T e (n) (k) (2.59b) 

n+1 n+1 — — 

Y n+l (k+1)2 = [1 " a]Y n+l (k)2 + a --0 (n)(k)T -0 (n)(k) (2.59c) 

Taken together, equations (2.55), (2.57), (2.58) and (2,59) 
define the adaptive solutions for the AR and MA multichannel 
lattice models. 

To understand the potential advantage offered by the 
adaptive lattice form, recognize that while the conventional 
approach solves a high order minimization problem by adapting 
all the coefficients at once, the lattice breaks the problem 
down into a succession of lower order minimizations at each 
stage and solves these lower order problems adaptively. The 
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dimensionality of the input autocorrelation matrices at 
each lattice stage in general is significantly less than 
that of the large input autocorrelation matrix in the con- 
ventional adaptive algorithm and consequently it is hoped 
that the possibility of a large eigenvalue disparity with 
its attendant slow convergence is reduced. 

This advantage is most evident in a single channel 
adaptive lattice where the inputs at each stage are single 
signals and their corresponding autocorrelation matrices 
are lxl in dimension. In this case the ratio of smallest 
to largest eigenvalues is unity and the convergence of each 
stage is quite rapid while the convergence of the overall 
model is independent of the eigenvalue ratio for the over- 
all higher dimension input autocorrelation matrix. This has 
been demonstrated by Satorius [Refs. 46 and 47] who has 
shown that the single channel adaptive lattice converges 
much more rapidly than the corresponding conventional adap- 
tive filter, and does so independently of the eigenvalue 
ratio on the overall input channel autocorrelation matrix. 

Furthermore in a single channel adaptive lattice, the 
time update relations are simplified by the fact that the 
forward and backward reflection coefficients are the same. 
Using the average of the mean square values of backward 
and forward prediction errors as a cost function and applying 
an adaptive algorithm it follows that 



fia 



K (n+1) (k+l)=K (n+1) (k) + 



a 



a , Ck) 
n+1 



7 



[e <ri) (k) r e ' Cri+ 1 ) (k) 



+e <n) <k-l)e <n+1) (k)] (2.60a) 

where 

a (k+1) ^ = [l-a]a , (k) ^+^-[e^ n ^ (k) ^ + e^ n ^ (k-1) (2.60b) 

n+1 n+l l 

The very nature of the lattice structure however with 
the output of one stage providing an input to the next stage, 
greatly complicates the analysis of the convergence proper- 
ties of the adaptive lattice model. Even when x(k) , the 
input to the lattice, is stationary, inputs to all stages 
except the first are nonstationary. An approximate analysis 
of convergence and stability on a stage by stage basis is 
possible if it is assumed that all prior stages have con- 
verged and are providing stationary inputs to the stage 
under investigation. With this assumption, the adaptive 
solution for the K^ n+ ’*‘' > , K^ n+ ^ and G < ‘ n+ ^ matrices are 
obtained from the operation of three independent LMS algo- 
rithms as shown earlier, with inputs given by e v ; (k-l), 
e^ n ^(k) and e^ n+ ^^ (k), respectively . Stability limits on 
the adaptive gains used in the stage and the convergence 
properties of the stage are then determined by the eigen- 
values of the P^ n \ and p^ n+ D matrices. A more exact 

analysis of the properties of the adaptive lattice that 



considers the nonstationary character of the inputs to the 
second and subsequent stages is not currently available. 
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III. ARMA MODELING 



One of the most serious disadvantages of either AR or 
MA modeling is the fact that to adequately represent even 
simple linear systems, both methods may require a large 
number of parameters (a high order model) . This problem 
arises since, from a transfer function standpoint, AR and 
MA models attempt to model the system using only poles or 
only zeros, in spite of the fact that the physical system 
may have both zeros and poles. While modeling the effects 
of a zero with a number of poles and visa versa can be 
analytically justified as shown in the previous chapter, 
it makes far more sense (both from the viewpoint of model 
accuracy and efficient use of model parameters) to let the 
model represent the system as it really is with both zeros 
and poles if this is at all possible. The ARMA model is a 
generalization of the AR and MA models and accomplishes 
exactly this, representing the system in rational transfer 
function form. 

It is worth noting that the titles of all pole and all 
zero modeling that have been associated with AR and MA 
modeling are misnomers. Both have equal numbers of 
zeros and poles. In the AR model however, all the zeros 
occur at the origin of the z-.plane as do the poles of a MA 
model. The ARMA model removes these constraints. 
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After a brief discussion of two alternate ARMA modeling 
methods due to Shanks and Prony , the equation error formu- 
lation for ARMA modeling is developed and the new results 
presented. Model transition formulas relating the ARMA 
model to the MA and AR models are developed and the input 
signal requirements of the modeling process explored. It 
is shown that after suitable modification, the Levinson 
algorithm can be applied to solve the ARMA modeling problem 
recursively in order and lattice solution methods are also 
developed for both a batch processing and an adaptive model 
solution. The results of experimental simulations of both 
of these modeling solution methods are presented and dis- 
cussed, and comparisons are made with conventional means of 
ARMA modeling using the equation error formulation. Finally 
it is also shown that the lattice solution methods can be 
generalized to solve for the multichannel ARMA model with 
arbitrary numbers of inputs and outputs. 

A. LINEAR ARMA MODELING AND ITS RELATION TO AR AND MA 
MODELING 

The ARMA model for linear systems assumes the current 
value of the output of the system is given by a weighted 
combination of present and past values of the input and 
past values of the output. In terms of the discussions of 
Chapter I, F^g is assumed to be zero and F^g and F^g take 
on the following forms 

M 

F 10 Cu(k)] = <*(n) u(k-n) (3.1a) 

n = 0 
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F 20 Cy(k-l)] 



(3.1b) 



M 

- b(n) y(k-n) 

n=l 

leading to a transfer function representation for the system 
given by 

M 

clC n) z^ n 

n, s n-0 A ( z ) / « . \ 

HU) = jj = 'STzT <3 ' lc) 

1- b(n)z -n 

n=l 

A number of methods exist for finding the model coef- 
ficients {a(n) } and {b(n)}. As stated in Chapter I, a MMSE 
solution via the direct form modeling approach requires the 
solution of a system of highly nonlinear equations and in 
general is untractable. An alternative is to first obtain 
an estimate of the denominator polynomial B(z) by some 
means such as AR modeling and then using this in the system 
shown in Figure 3.1, estimate the numerator polynomial A(z) 
by setting its coefficients to minimize the mean square 
value of the error. This method was first explored by 
Shanks. [Ref. 49] 

Another alternative is to apply the Prony method [Refs. 

8, 52 and 56] derived in Appendix E which obtains the model 
parameters by matching the impulse response of the system 
and model over the first W+M+l sample intervals. Both of 
these techniques share a common characteristic. They both 
start by independently estimating the denominator coefficients 
(or model poles) and then, given this estimate, solve for 

7 7 



u(k) 




e( k) 



/ 

L 



J 



Figure 3.1. Shanks method for ARMA modeling 

the numerator coefficients (or model zeros). This is in- 
tuitively unappealing in that one would expect these two 
estimation problems to be more closely coupled with the 
zero estimates also affecting the estimates of the poles. 

The application of the equation error formulation to the 
ARMA modeling problem permits simultaneous estimation of the 
model zeros and poles and was first used by Kalman [Ref. 23] 
in work on self-optimizing control systems. The prediction 
error form of the model is considered here where is set 

to zero while 

M 

F.„[u(k)] = a(n) u(k-n) (3.2a) 

n= 0 

M 

F„[y(k)] = y(k)- b(n) y(k-n) 

1 n-1 



(3.2b) 




The analysis model is depicted in Figure 3.2 where 



M 

A(z) = ^2 a(n)z~ n (3.3a) 

n = 0 



and 

N 

B(n) = 1- J2 b(n) z^ n (3.3b) 

n=l 



and these polynomials are the estimates of the system trans- 
fer function numerator and denominator. 




E 



0 



( z) 



Figure 3.2. 



The equation error formulation for 
ARMA modeling. 



The expression for the model error can be written as 



T ! +, . T- 



e n (k) = y(k) - [y(k) ; u (k) ] 



(3.4a) 



where y(k) and b are defined as in equation (2.7) and 



u + (k) = Cy(k) • - • u(k-H)]^ 



(3.4b) 



a + = [a( 0 ) • • • a(M) ] T 



(3.4c) 



This results in an expression for the mean square error 
which is a quadratic function of the model coefficients, 
with the MMSE solution for those coefficients given by 
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(3.5a) 



(3.5b) 



1 . Model Transition Re l ationships 

Comparing equations (3.4) and (3.5) with their AR and 
MA counterparts, it is clear that the ARMA model provides a 
generalization of these other models since they can be 
obtained from the ARMA model by assuming that either a + or b 
is zero. Consequently it is susceptible to the same type of 



bias introduced in the AR and MA models by the presence of 
additive noise on either the system input or output signals. 
To develop the relationships between these models further, 
consider the inversion of the correlation matrix in equation 
(3.5a) in terms of its component matricies . Since the left 
and right inverses of a nonsingular square matrix are the 
same, either 
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(3.6a) 



(3.6b) 



can be used to find the required inverse in partitioned form. 
Solving for the right inverse of equation (3.6b) yields 



A = (R - R .R'i . R . )" 1 

— — vv — +— + + — + 

J J yu u u u y 



(3.7a) 



B = -R^ 1 R (R , , - R . R" 1 R J' 1 
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JJ ^yu uu uy^yu 



(3.7b) 
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( 3 , 7d ) 



Solving for the left inverse of equations (3.6a) gives iden- 
tical results for A and D while equivalent but different 
forms are obtained for B and C given by 
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yj yuuuuy yuuu 
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(3.3a) 



(3.3b) 



Using equations (3.7) in equation (3.5a), the solutions for 
the ARMA coefficient vectors are given by 
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— — y y — yu — + +— + — yy 
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(3.9a) 



and 
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(3.9b) 



The matrix inversion lemma [Refs. 11 and 19] which states 
that 

(E+FGH) -1 = E” 1 - E' 1 F(G" 1 +HE~ 1 F) _1 HE" 1 • (3.10) 
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for nonsingular square matrices E, G and E+FGH, can be 
used in equation (3.9) to rewrite them as 



b = R -1 r + R" 1 R ,(R . , - R , R _1 R i )" 1 

— — yy — yy — vy — + _ + + _ + _yy _ + 
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The all zero and all pole model solutions of corresponding 
orders however are 
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where subscripts are used to distinguish these solutions 
from their counterparts in the ARMA zero pole model. 

From equations (3.11) it follows that 
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Following a similar development, the left inverse relation- 
ships of equations ( 3 . 8 ) can be used to write 
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(3.12c) 
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Equations (3.12) are termed the "Zero Pole Model Transi- 
tion Formulas" and specify the relationships between the 
various models. It is interesting to note that equations 
(3.12a) and (3.12b) take the form of a linear observer with 
a new estimate of the solution given by the old estimate 
plus a gain times an error term. To gain some insight into 
the functioning of these formulas, consider the form of R^^(n) 
and R u y(n) for the linear system described by the transfer 
function of equation (3.1c) 

N M 

Ryy ( - n) = £ fa(i) Ryy(-i-n) + £ a(i) Ry U ( -i-n ) 

L=1 L=0 



( 3.13a) 
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V M 

R (-n) = y ' 6 ( i ) R (-i-n) + ^ ^ a(i) R (~i-n) 

uy / ^ uy / v uu 

L=1 i= 0 



(3.13b) 



Assuming that N=N and M=M, and writing equation (3.13a) for 
-N_<n_<-1 and equation (3.13b) for ~M_<n_<0 results in 



Rb+R,a + =r (3. 14a) 

-yy - - yu + - -yy 



* + 



b + 
u y - 



it t ! 
u u 



£ + 
u y 



( 3.14b) 



These constraints on the system input and output auto and 
cross correlation coefficients are the ARMA modeling equa- 
tions of (3.5a) with the model coefficients replaced by the 
system parameters. In AR modeling, b^p is set to satisfy 
the constraints of equation (3.14a) with the assumption that 
a + is zero. The error term in the model transition formula 
(3.12a) then checks this solution to see if it also satisfies 
the constraints of equation (3,14b) still assuming that a + 
is zero. 



error = R , b AP 
Vy " AP 



£ + 

u ^ 



(3.15) 
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If this error is zero and the constraints of equation (3.14b) 
are satisfied, equation (3.12a) sets b 7p = b^ p and equation 
(3.12d) sets a + to zero. If however the error is nonzero, 
(3.12a) adjusts b^ p in proportion to the error to obtain 
b^p and (3.12d) then provides a nonzero a* p . Thus equations 
(3.12a) and (3.12d) are complementary, specifying the ARMA 
or zero pole model solution of order M over N when given 
the N-th order AR or all pole model solution. 

In like manner, equations (3.12b) and (3.12c) give the 
ARMA model solution of order M over N when given the M-th 
order MA or all zero model solution. The all zero solution 
is obtained from equation (3.14b) assuming that _b is zero. 
Equation (3.12b) checks this solution against the con- 
straints imposed by equation (3.14a) with the same assump- 

. . + . .4* 

tions and adjusts a ^ appropriately to determine a^p . 

Equation (3.12c) then sets b^p, completing the zero pole 
model solution. 

2 . Modeling Input Signal Requirements 

Another aspect of the modeling problem that must be 
considered is that of system identif iability . If, from the 
available measurements of signals, a model can be obtained 
that accurately represents the system's operation, the 
system is considered identifiable. The two issues that 
arise therefore are the measurement requirements (which 
signals must be measured) and requirements on the input 
signal used to excite the system during the modeling 
process. In the equation error formulation of the ARMA 
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model, both existing signals (system input and output) must 
be observed (or at least a knowledge of their auto and cross 
correlation functions must be available) . Most discussions 
of input signal requirements for identif iability simply 
state that the system can be identified if the input signal 
is sufficiently rich, persistent or exciting, eg [Ref. 19] 

To explore the question of input signal requirements fur- 
ther, consider the mean square equation error cost function 
being minimized. Assuming that the equation error signal 
is ergodic and has finite energy its mean square value is 
obtained via time averaging as 

oo 

E2 = s{e(k) 2 } = e(n) 2 (3.16) 

n = -°° 

Applying Parsevals relation this becomes 

00 7 T 

E 2 = £ e (n ) 2 = J E(e j9 )E*(e j9 ) d9 (3.17) 

n = -°° -7T 

where 0 = uT and * indicates the complex conjugate. The 
equation error is represented in the transform domain as 

E(z) = [B(z)H(z) - A(z)]U(z) (3.18) 

and using this in equation (3.17), the cost function becomes 

7T 

/ 2 . 2 

| CB(e^ 9 )H(e^ 0 )-A(e 29 ) | ’ |U(e^ 9 )| d0 (3,19) 

-TT 



E 2 = 2n 



showing that the power spectrum of the input signal acts as 
a frequency dependent weighting function on a transfer 
function error term. Therefore to identify the system 
equally well at all frequencies, the input must have a 
flat spectrum as will be the case for a white noise input 
or an impulse function input. Otherwise, the model trans- 
fer function will only be matched to the system transfer 
function over the range of frequencies where the input 
signal has significant power. 

As an example consider the equation error ARMA 
model for a fourth order system driven by a single sine 
wave input at a frequency of it/ 3. According to equation 
(3.19), the model transfer function will only be required 
to match the system at this single frequency, and to accom- 
plish this, only a first order model is needed. Any in- 
crease in model order above first order therefore should 
have no effect. Figure 3.3a shows a comparison of the mag- 
nitude spectrum of the fourth order system and its first 
order ARMA model obtained using the sinusoidal input and 
as anticipated they match at the frequency tt/3 (coinciden- 
tally they also match at one other frequency as well). 

Figure 3,3b shows the same comparison but with a fourth 
order ARMA model. It is clear that increasing the model 
order failed to improve its accuracy and that the model 
accurately represents the system only at the frequency of 
the input signal and, by coincidence, at one other frequency. 
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(a) I H !«•) I DB VS «• — SYSTEM 

— 1 ST ORDER MODEL 




I H [■Q-) I Qg VS 'S’ — SYSTEM 

— 4 TH ORDER MODEL 



Figure 3.3. ARMA models for a system driven by 
a single sinusoid. 
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It should be noted that this type of analysis to 
determine input signal requirements could be applied to the 
AR and MA models as well resulting in the same conclusions . 

B. A RECURSIVE IN ORDER SOLUTION FOR THE ARMA MODEL 

Since the equation error formulation for the ARMA model 
is a generalization of similar formulations for AR and MA 
modeling, it is reasonable to assume that a Levinson-type 
algorithm could be devised to obtain the ARMA solution 
recursively in order, and that from that algorithm, lattice 
filter methods applicable to the ARMA modeling problem could 
be derived. Attempts to develop such an algorithm directly 
for the ARMA modeling equation (3.5a), however, fail to 
provide useful results. The first problem that arises is 
in deciding which model order to make recursive; the order 
of the numerator polynomial , the order of the denominator 
polynomial or both. If it is assumed that the numerator 
and denominator are of equal orders (M=N) , the ARMA modeling 
equations become as shown in equation (3.20). However, 
efforts to develop a Levinson-type algorithm for this system 
of equations, where the numerator and denominator polynomials 
of the model are incremented simultaneously to obtain re- 
cursive in order solutions, are still frustrated by the 
presence of the (N+l)-st row and column in equation (3.20). 
This arises because the numerator coefficient vector a + is 
a (N+l) -vector while the denominator coefficient vector b 
is a N-vector. If it is further assumed that the coefficient 
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a(0) is known in advance or can be estimated in some other 
fashion, equation (3.20) for the solution for the remaining 
2N coefficients becomes as written in equation (3.21). 
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(3.21) 



The coefficient a(0) essentially has the role of a gain for 
the model, and a method for estimating it after all the other 
coefficients are obtained (as was done in AR modeling) will 
be discussed later. Equation (3,21) can be written as 
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(3,22) 



Now consider the form of a two channel autoregressive 
model where the two input channels are y(k) and u(k) ; that is 



x^(k) = y(k) 



x (k) = u(k) 

Using equations (A, 4a) and (A. 7b) the two channel AR modeling 
equations are 
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and comparing equations (3.22) and (3.23) it is clear that 
with the exception of the gain term a(0), the ARMA modeling 
solution can be obtained from the two channel AR solution 
of (3.23). Furthermore equation (3.23) can be solved inde- 
pendently of the gain term via the Levinson algorithm as 
shown in Appendix A or via the multichannel lattice methods 
developed in the previous chapter. Then all that remains 
to complete the solution for the ARMA model is to estimate 
the gain term a(0) and solve for the other model coefficients 
using 
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(3.24) 
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From this it follows that the transfer functions A(z) and 
B(z) for the ARMA prediction error analysis model can be 
related to the two channel AR prediction error matrix poly- 
nomial transfer function by 
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= Cl - D(z>3 




-A(z ) 
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It is also easy to relate the ARMA equation error to the 
two channel AR prediction error vector. Using equations 
(A. 3) and (A, 5), the two channel AR prediction error vector 
is written as 
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(3.26) 
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Defining 



ip = 
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-a( 0 ) 



(3,27a) 
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and postmul tip lying equation (3.26) by ip and using (3.24) it 
follows that 



e(k)^ ip = y(k) - a(0)u(k) - [y(k)^t u(k)^] 



(3.27b) 



But from equation (3,4a) this is exactly the ARMA model 
equation error so that 



e Q (k) = e(k)^ \p 



(3.27c) 



and 



e{e Q (k) 2 } = i^ T P ip ( 3 . 27d) 

where P is the forward prediction error covariance matrix 
for the 2 channel AR model. Equation (3.27d) also provides 
a means of estimating the gain term a(0) after the two channel 
AR solution has been obtained by setting it to minimize the 
mean square value of equation error resulting in 

e{e (k)e (k)} 

a( 0 ) = ^ * (3.23) 

e{e u (k) } 

and completing the ARMA model solution. 
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The portion of the ARMA model solution in equation (3.24) 
given by the two channel AR solution can be found recursively 
in order using either the Levinson algorithm or the lattice 
filter techniques. If the desired ARMA model order is not 
known in advance, a model gain term a(0) can be estimated 
for each order two channel AR solution to find the ARMA 
model of corresponding order along with its MSE . In this 
fashion, the entire family of ARMA models for the system from 
order zero to order N, along with their mean square errors 
are obtained. If, on the other hand, the desired ARMA model 
order is known apriori, the gain term need not be calculated 
at each stage . Only one gain term must be calculated to 
obtain the ARMA solution after the appropriate order two 
channel AR solution has been found. 

It has already been shown that to fully identify the 
system using the equation error ARMA formulation, the input 
signal must have a flat spectrum as in the case of white 
noise. When white noise is used as the system input u(k) , 
simplifications emerge in the solution of the two channel 
AR model via the Levinson algorithm or lattice methods. 

. 2 

For a white sequence with variance , 
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and 
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(3.29b) 



where h(n) is the sampled impulse response of the system. 
Consider equation (A. 16c) for 
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(3.29c) 



This shows that k^^ an< ^ ^2 2^ are zero an( ^ furthermore since 

(n) . (n) r: , , . . . . , 

r and r are zero for all n it is seen that 
— yu — uu 



K 12 *22 U 



( 3 . 29d) 



for all n as well. This can readily be understood by con- 
sidering the role of these two coefficients at each stage 
in the AR prediction of y(k) and u(k) . k^ and k ^2 the 

coefficients used in trying to predict u(k) from past values 
of y(k) and u(k) , and when u(k) is a white sequence it 
cannot be predicted, forcing these coefficients to be zero, 
No such simplifications occur in the backward prediction 
problem (and therefore in the K" matrices) since even for 
a white u(k) , a backward prediction of u(k-n) from subse- 
quent values of u and y is possible. This is because in 



general, a linear dependence of y(k) upon past and present 
values of u(k) can occur (and certainly will occur when the 
relationship between y(k) and u(k) is described by an ARMA 
model) , 

As a result of these simplifications, it is seen from 
equation (A. 19a) that the polynomials d^ 2 (z) and d 22 (z) are 
zero when u(k) is white and the ARMA model is given by 



B ( z ) 




i-d yz) 


0 




1 


-A ( z ) 




_' d 21 (z) 


1 




-a( 0) 



In this special case it follows that 

B ( z ) = det[I -D (z>] (3.31) 

and therefore stability of the ARMA model and of the two 
channel AR model are equivalent. In general, however , no 
such connection exists for arbitrary input signals. Further- 
more, even when the system input u(k) is white, solutions 
for k^ 2 and k 22 will not in general be exactly' zero since 
the required correlations are usually not known and must be 
estimated . 

This development showing that the ARMA model solution 
can be obtained from a two channel autoregressive model de- 
pends on two assumptions ; 

1) The numerator and denominator polynomial orders in 
the model transfer function are assumed to be the 



qu 



same and are incremented simultaneously to build up 
the desired solution recursively in order; 

2) It is assumed that the coefficient of z to the zero 
power (a(0)) in the numerator polynomial of the model 
either known in advance, or that another means of 
estimating it can be found so that it need not be 
estimated directly in the modeling equations of 
(3.20) , 

The second assumption causes no concern since the two channel 
autoregressive solution is obtained independently of a(0), 
and given that solution, it has been shown that a(0) can 
indeed be estimated in another fashion in equation (3.28). 

The first assumption however, warrants further consid- 
eration since it seems somewhat restrictive (at first) to 
require that the numerator and denominator polynomials of 
the model have the same order when in fact, the system being 
modeled may have different order numerator and denominator 
polynomials. To see that this assumption is not restrictive 
in general, consider what is occuring as the model is built 
up recursively in order. At each model order n, the pro- 
cedure finds the best n-th order model (with n zeros and n 
poles) in a minimum mean square equation error sense. 

Making the model numerator and denominator orders different 
(or equivalently, forcing some of the coefficients to zero 
in the model where the orders are the same) places a priori 
constraints on the model, forcing some of the poles or 
zeros to the origin in the z-plane, rather than allowing the 



model to place them at will to minimize the cost function. 

As an example, consider the process of obtaining an ARMA 
model for a system given by 

2 

£ a(n) z‘ n 

H( z) = 

1- £ b(n) z" n 
n=l 

where two of the 'systems four zeros actually occur at the 
origin of the z plane. Constraining any of the model zeros 
to the origin at orders one, two or three will result in a 
model with higher cost (MSE) than if they were not con- 
strained. Even at order three, a model without constraints 
can be expected to use the "extra" zero to help in approxi- 
mating the effects of the system’s fourth pole as shown in 
equation (2.13) yielding a lower cost and more accurate 
model than would result 'if one zero were forced to the origin. 
Only at order four are such constraints reasonable but even 
then, they are not necessary since the modeling procedure 
itself should recognize that the best fourth order model 
will have two zeros at z=0. 

Therefore, it is seen that assuming equal orders for the 
model numerator and denominator is entirely reasonable as a 
general approach in obtaining MMSE models for unknown sys- 
tems or even reduced order models for known systems. When it 
is known in advance that the best MMSE model for a system 
has zeros at z=0, imposing such a constraint on the model can 
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reduce the computational complexity of obtaining the solution, 
but even here the assumption of equal orders is not restric- 
tive . 

C, LATTICE FORM ARMA MODELING 

In chapter two it was shown that the lattice structure 
of Figure 2,7 could be applied to solve the multichannel 
AR modeling problem in terms of the reflection coefficient 
matrices given by equations (2.49). Since the ARMA model 
solution can be obtained from a two channel AR solution with 
y(k) and u(k) as the input channels, only the structure 
described by equation (3,27c) need be added to a two channel 
AR lattice to obtain the lattice form of the ARMA analysis 
model. It is interesting to look at the exact structure of 
this lattice model as shown in Figure 3.4 for a second order 
case. This structure is seen as a lattice interconnection 
of two single channel AR lattices operating on the input 
signals y(k) and u(k). The coefficients on the main diag- 
onals of the K and K matrices specify the single channel 
lattices while the off diagonal elements specify the inter- 
connections. (This will also be the case for extensions 
to lattices with any number of channels.) 

The ARMA synthesis model implementing the transfer func- 
tion A(z)/B(z) can also be put in lattice form. The for- 
ward prediction in the two channel AR analysis lattice is 
described by 
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(3.32) 



and the equation for e^(k)^ n+ ^ can be rewritten as 



e (k) (n) = e (k) (n+1) + k, n (n+1) i (k-l) (n) + k 91 (n+1) i (k-l) (n) 
y y 11 y 21 u 



(3.33) 



Equation (3.33) along with the equation of in (3.32), and 

the equations for the backward prediction (2.45b) describe 

the structure shown in Figure 3.5 for a second order case. 

( 2 ) 

To provide the required input at e y (k) , recognize from 
equation (3.27c) that 



e y (k) (2) = e Q (k) + a(0)e u (k) (2) 



(3.34a) 



If the ARMA model is an accurate representation of the 
system, the equation error e^(k) will be quite small (ideally 
zero) so that in general for the N-th order case 



e (k) (N) . a( 0 ) e (k) (N) 

y u 



(3 . 34b) 



This is indicated by the dashed feedback path in Figure 3.5. 
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Figure 3.5. Lattice synthesis structure for ARMA modeling. 



D. ARMA MODEL SIMULATIONS; BATCH PROCESSING 



ARMA modeling procedures for linear systems using both 
the lattice filter method and a brute force matrix inver- 
sion with equation (3.20) have been implemented and the 
results of these two approaches have been compared in over 
a thousand model simulations of more than thirty different 
linear systems. The experimental results which follow are 
a representative sampling of these simulations. 

In the lattice filter method, equations (2.49) were used 
to calculate the forward and backward reflection coefficient 
matrices , with time averages over a specified interval used 
to estimate the required correlations in = P ^ ^ and 

for 0 _< n _< N-l. Equations (A. 15), (A. 16a) and (A. 16b) 
were used to obtain the two channel AR model coefficients 
from the K and K matrices and with the gain calculated 
in equation (3.28), equation (3.24) was used to obtain the 
desired ARMA model coefficients. Equations (A. 18) were 
used to update the forward and backward prediction error 
covariance matrices from one lattice stage to the next. 
Figure 3.6a provides a flow diagram of the procedure. 

In the brute force matrix inversion method with equation 
(3.20) time averaging was again employed to estimate the 
required correlation coefficients. A rectangular window 
was applied to the data, however, to retain the even symmetry 
of the autocorrelation function in these estimates. The 
ARMA model coefficients were then obtained using a general 
purpose library subroutine (which employed gaussian 
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elimination) to solve equation (3.20). Figure 3.6b provides 
a flow diagram of this procedure. 

In both cases zero mean, unit variance gaussian white 
noise was used as the system input. In the simulation re- 
sults that follow, a description of each system discussed 
(transfer function coefficients, zero locations and pole 
locations) is listed in tabular form and root locations in 
the z plane as well as transfer function magnitudes are 
plotted for the system, and various models obtained for it. 

In each case, models were obtained for averaging intervals 
of 200, 500, 1000, 2000 and 4000 data points. Only the 
results for the two extremes of 200 and 4000 points are 
included here. 

The first system considered has asecond order numerator 
in z inverse and a fourth order denominator, and its char- 
acteristics are listed in Table 3.1. Figure 3.7 shows a 
comparison of the root locations and transfer function 
magnitude of this system with those of fourth order lattice 
filter and brute force models obtained when correlations 
were estimated by averaging over 200 samples. Figure 3.8 
provides the same comparison for a longer averaging interval 
of 4000 samples of data. While both methods perform com- 
parably with the longer averaging interval, the lattice 
method produces a far more accurate model with the short 
averaging interval . It is also interesting to compare the 
performance of the two methods when the system is overmodeled; 
that is , when the model order is increased beyond that of 
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Figure 3.6a. Flow diagram of batch processing 
lattice solution for all ARMA 
models orders 1 to N . 
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Figure 3.6b. Flow diagram of batch processing brute 
force solution for all ARMA models of 
order 1 to N. 
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TABLE 3.1 



SYSTEM A 



TRANSFER FUNCTICN COEFFICIENTS 



NUMERATOR 


DEN0MI N 4T0R 


A ( C ) = 


0.25C00 




A (1 ) = 


0.35000 


B( 1) = 1.14000 


A ( 2 ) = 


0.24500 


8(2) = -1.45490 


11 

ni 


o 

• 

o 


B(3> = 0.68490 


A (4) = 


o 

• 

o 


8(4)= -0.40745 






ROOT LOCATIONS 




ZEROS 


POLES 


RE 


IM 


RE IM 


G.7CC00 


C.7000C 


0.50000 0.50000 


0.7CC00 


-0. 70000 


0.5000C -0.5000C 


C.C 


O 

• 

o 


0.07000 0.90000 


o 

• 

o 


o 

• 

o 


0.07000 -0.90000 
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Figure 3.7 



106 





Figure 3.7 con't 
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Figure 3 , 8 
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Figure 3 . 8 con r t 
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the system. Figures 3.9 and 3.10 show such a comparison 
when sixth order models are obtained for this fourth order 
system. Ideally, the extra zeros and poles should lie at 
the origin in the z plane making the additional transfer 
function coefficients zero. Figure 3.9 shows that for a 
200 point averaging interval, the lattice locates the extra 
roots in the vicinity of the origin while the brute force 
model does not. When the averaging interval is increased 
to 4000 points, the extra roots of the lattice model move 
in toward the origin while those of the brute force model 
do not. Instead, a zero pole cancellation at some arbitrary 
location occurs in the brute force model. In all cases 
investigated during this effort, the lattice method clus- 
tered the extra roots in the vicinity of the origin and as 
the averaging interval was increased to take in more data, 
these roots were consistently moved in closer to the origin. 
This property is further evidenced by the plot of mean square 
equation error as a function of model order shown in Figure 
3.11 for a 200 point averaging interval. The MSE for the 
lattice model flattens out at order four indicating that 
further increases in model order fail to increase its 
accuracy. Meanwhile the MSE for the brute force model con- 
tinues to decrease beyond fourth order as it uses the addi- 
tional roots to reduce modeling errors caused by innaccuracies 
in the fourth order model. 
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Figure 3.9 
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F igure 3 . 9 con ' t 
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Figure 3,10 





Figure 3.10 con't 




Figure 3,11. Mean square value of equation error 
(as a percentage of the mean square 
value of system output) vs model order 
for lattice and brute force models of 
system A with 200 point averages. 
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To investigate the ability of these modeling methods to 
distinguish roots located near one another in the z plane, 
a pair of zeros were added to the previous system in close 
proximity to one of the pole pairs. The characteristics of 
this system are listed in Table 3.2. Figures 3.12 and 3.13 
show the lattice method and brute force modeling results for 
200 and 4000 point averaging intervals. With data over only 
200 sampling instants, neither method is able to accurately 
model the effects of the adjacent roots. When the averaging 
interval is increased, the lattice correctly models the 
plant while the brute force method does not, and even results 
in an unstable model. (Figure 3.13b comparing the transfer 
function magnitudes has been plotted in spite of the model 
instability . ) 

To investigate the ability of these zero pole modeling 
methods to model systems that are actually all zero or all 
pole, the systems listed in Tables 3.3 and 3.4 were used. 

The modeling results are shown in Figures 3.14, 3.15, 3.16 
and 3.17. 

The conclusions drawn from the results of this experi- 
mental study are as follows: 

1) For short data lengths, the lattice filter method 
provides more accurate models than does the brute 
force modeling method. 

2) When the system is overmodeled using the lattice 
method, the excess roots are - clustered about the 
origin in the z plane and as the averaging interval 
is increased and more data taken in, these roots move 
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TABLE 3.2 



SYSTEM B 



TRANSFER FUNCTICN COEFFICIENTS 
NUMERATOR DENOMINATOR 



A ( C ) = 


1. COCOO 






A ( 1 ) = 


1.34000 


8( 1) = 


1.14000 


A ( 2 ) = 


1.79940 


8(2) = 


-1.45490 


A (3i = 


1.2060C 


5(3) = 


C. 88490 


A (4) = 


0.88533 


9(4) = 


-0.40745 






ROCT LOCATIONS 






ZEROS 




PCLES 


RE 


IM 


RE 


I w 


7CCCC 


0.7000C 


0.50000 


0.5000C 


7CC0C 


-0. 700CC 


0.50000 


-0.50000 


C2COO 


0.9500C 


0.07000 


C.90CCC 


C2CCC 


-0.9500C 


0.07000 


-0.9000C 



117 



30 



1 



T 




(b) 




I H («•) I 0B VS e — SYSTEM 

4 TH ORDER BRUTE FORCE MOOEL 

200 POINT AVERAGES 



Figure 3.12 
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Figure 3.12 con't 
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Figure 3.13 
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Figure 3.13 con't 
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TA8LE 3.3 



SYSTEM C 

TRANSFER FUNCTION COEFFICIENTS 
NUMERATOR DENOMINATOR 

A ( C ) = l.COCOO 

A ( 1 ) = -1.50000 B( 1) = C.C 

A ( 2 ) = C. 62500 0( 2 )= O.C 



ROOT LOCATIONS 

ZEROS 

RE I M RE 

COO 0.25000 0.0 

C.75CCC -0.2500C 0.0 



FCLES 



IM 



0.0 

0.0 



TABLE 3.4 



SYSTEM 0 

TRANSFER FUNCTION 
NUMERATOR 
A < C ) = l.COCOO 



COEFFIC IENTS 
DENCMIN ATCR 



A ( 1 ) = 0.0 

A ( 2 ) = C.O 



8( 1)= 1.50000 

0(2)= -0.62500 



ZEROS 



RE IM 

0.0 0.0 

0.0 0.0 



ROOT LOCATIONS 



POLES 

RE IM 

0.75000 0.25000 

0.75000 -C. 25000 



122 




I H (■&) I DB VS e — SYSTEM 

2 NO ORDER LATTICE MOOEL 

200 POINT AVERAGES 




I H («) I 0B VS e — SYSTEM 

2 NO OROER BRUTE FORCE MODEL 

200 POINT AVERAGES 



Figure 3.14 
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Figure 3.14 con’t 
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Figure 3.15 
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Figure 3.15 con’t 
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Figure 3.16 
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Figure 3.16 con f t 
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Figure 3,17 
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Figure 3.17 con ' t 



consistently in toward the origin. This also forces 
the excess transfer function coefficients and re- 
flection coefficients to be very small (ideally zero), 
clearly indicating that the model order is higher 
than necessary. The brute force method, on the other 
hand, scatters the excess roots throughout the z 
plane and produces cancellations of the excess zeros 
and poles. This results in nonzero values for the 
excess transfer function coefficients. 

3) The MSE as a function of model order is generally 
better behaved for the lattice method than for the 
brute force method, decreasing rapidly until the 
correct model order is reached and then failing to 
decrease substantially as the model order is increased 
further . 

Two qualitative explanations for the improved performance 
of the lattice filter modeling method are offered. The 
first is that the data is not windowed in the lattice method 
while a window that is nonzero only over the span of the 
averaging interval must be used in the brute force method to 
maintain even symmetry in the autocorrelation function esti- 
mates. The effects of this difference between the two modeling 
methods should be most noticeable for short data lengths, and 
become less evident as the length of the averaging interval 
is increased. The second possible cause for the lattice 
method's better performance is that the actual output se- 
quences of the n-th order lattice are used to calculate the 
coefficients at the (n+l)-st order stage. In this manner. 



modeling errors that have occurred in the n-th order lattice 
can be compensated for to some extent in the (n+l)-st order 
stage. No similar phenomenon is evident in the brute force 
modeling approach. Consider the differences between the 
Levinson algorithm (which is equivalent to the matrix inver- 
sion method) of equations (A. 16) and the lattice method given 
by equations (2.49). Both methods calculate the corrections 
that must be made to the n-th order model to obtain the 
(n+l)-st order model in terms of K^ n+ '*'^ and K^ n+ '^ and 
theoretically , 

e{e (n) (k) e (n) (k-l) T } = R_ xx (n+1) T - (D (n) f) (n) 

When correlations are estimated by averaging over finite 
intervals however this equality will not in general be 
satisfied making the two methods different. The Levinson 
algorithm will estimate the correction terms to be added to 
the optimum n-th order model while the lattice method esti- 
mates the correction terms to be added to the estimated n-th 
order model actually obtained. 

The improved performance of the lattice method is not 
achieved without cost, however. The method is made compu- 
tationally expensive by the need to store the system input 
and output sequences and the lattice prediction error se- 
quences and pass them through successive stages of the 
lattice as it is built up in order. The computational com- 
plexity of the two methods is compared in Table 3.5. 
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Table 3,5. 



Computational requirements for batch 
processing ARMA modeling of an N-th 
order system using P samples of the 
system input and output. 



Number of 
Correlation 
Estimates 
Required 


4N+3 averaged over 
P data points 


4N+3 


Matrix 

Inversions 


1 - dimension 2N+1 


2N-dimension 2 
1-dimension 1 


Data Storage 
Requirements 


N.A. 


2P samples 


Computations 
To Pass Data 
Through The 
Filter 


N . A. 


8NP multiplica- 
tions 

4NP additions 
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E. ADAPTIVE LATTICE ARMA MODELING 

In addition to the batch processing method described in 
the previous section, the lattice ARMA analysis model can be 
implemented adaptively as well. The adaptive lattice solu- 
tion for the multichannel AR lattice, which solves most of 
the ARMA modeling problem, has already been described in 
chapter II. To make the lattice ARMA model adaptive, only 
an adaptive solution for the gain term a(0) need be added. 

To avoid ambiguity, the time varying adaptive estimate of 
this term at time k is denoted by a^(k). Applying an LMS 
adaptive algorithm it follows that 



and using equation (3.27d) to form an instantaneous estimate 
of the gradient yields 



a Q (k+l) = a Q (k) - V(k) 



(3.35) 



V(k) = -2 e^(k) [ e^ (k) -a^ (k) e^(k) ] 



(3.36) 



-2 e (k) e (k) 
u o 



so that 



a Q (k+l) = a Q (k) + 2 e u (k) e Q (k) 



(3.37) 



134 



Here it is clear that 

1) e g ( k ) is analagous to the error signal, 

2) e u (k) is analagous to the input signal. 

3) e^(k) is analagous to the desired signal, 

so that for stability pg must satisfy 

0 < \i n < - * — (3.38) 

0 e{e u (k) 2 } 

Once again, however , the mean square value of e^(k) may vary 
from stage to stage and over time as the two channel AR 
lattice adapts making it appropriate to apply relations 
similar to equations (2.58) and (2.59) 

"o (k) -• a^TkT (3 ' 39a) 

O 0 (k) = (1-a) a Q (k-1) + a e (k) 2 (3.39b) 

where a is the normalized adaptive step size and the depen- 
dence on order is implicit (a superscript (n) could be used 
on a^ , Ug , V, and all the error terms in equations (3.35) 
through (3.39) to explicitly denote their dependence on the 
order of the solution) , 

This adaptive lattice ARMA modeling scheme was implemen- 
ted and the results of its use in modeling system A described 
in Table 3.1 are presented here. A normalized adaptive step 

l 

size of a = .05 was used in the following simulations and 
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the results represent an ensemble average of one hundred 
trials. Unity variance white noise was used as the system 
input. A flow diagram of the procedure is shown in Figure 
3.18. 

Figure 3.19 shows a plot of the mean square equation 
error as a function of time for a fourth order model while 
it adapts and Figure 3.20 shows the behavior of one term in 
the K matrix, k^^ , and one term in the K matrix, k^i^ at 
the first five lattice stages, 1 _< n _< 5 . These graphs of 
the k.^ terms clearly show the successive stage by stage 
manner in which the lattice model adapts. Figure 3.21 shows 
a comparison of the transfer function magnitude and root 
locations of the system with those of the fourth order 
adaptive model after 1000 and 2000 iterations. Figure 3.22 
shows the same comparison in the overmodeled case when a 
sixth order model is obtained for this fourth order system. 

While these results show that the adaptive solution 
performs well and is a viable alternative to the batch pro- 
cessing solution. Figures 3.22c and 3.22d show that one 
advantageous characteristic of the batch solution has not 
carried over. The excess roots in the overmodeled case are 
not tightly clustered in the vicinity of the origin indi- 
cating that the excess transfer function coefficients are not 
near zero. In examining Figure 3.20 it is also evident that 
the convergence of the reflection coefficients at the 
overmodeled lattice stage (k^lp and k^^ ) towards their true 
value of zero is quite slow. This relatively slow tracking 
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Figure 3,18, Flow diagram of adaptive lattice ARMA 
solution . 
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Figure 3.19 

Mean square value of equation error for the 
fourth order adaptive lattice model as a 
function of time. 
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Figure 3.20a 

K(l,l) term from the forward reflection 
coefficient matrices at the first five 
stages of the adaptive lattice model as 
a function of time. 
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TIME 

Figure 3.20b 

K(l,l) term from the backward reflection 
coefficient matrices at the first five 
stages of the adaptive lattice model as 
a function of time. 
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Figure 3 . 21 
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of overmodeled, zero valued parameters has been found to be 
a general characteristic of the adaptive lattice algorithm 
and has also been noted briefly by Morf. [Ref. 38] 

To understand the reason for this behavior, consider 
what occurs as the overmodeled fifth stage in the previous 
example adapts. Initially, before the coefficients of the 
first four lattice stages converge to their optimum values, 
the prediction error sequences out of the fourth stage are 
large and suboptimum. These signals provide incorrect in- 
puts to the fifth stage driving its parameter estimates to 
some values other than their optimum zero values. As the 
first four stages converge , the prediction error sequences 
going into the fifth stage get small and since they drive 
the gradient estimates , convergence back toward zero is 
quite slow in these parameter estimates. 

The cost functions being minimized at the overmodeled 
fifth stage are the trace of P ^ ^ and P ^ ^ given by 

P (5) = P (t+) . a ( 4) K (5) - K (5)T A (4)T + K (5) T p( 4) K (5) 

(3.40a) 

p«> = P< 4 > - a (4) V 5) - K (5)T A (4) ♦ K (5)T P (4) K (5) 

( 3.40b) 

Applying the results of Appendix F, the parabolic surfaces 

defined by these cost functions are described by the eigen- 

( 4 ) — ( 4 ) 

values and eigenvectors of P and P , the prediction error 
covariance matrices at the fourth stage. Consider the forward 
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predictions . The actual system output is given by 



4 4 

y(k) =. ^2 b(i) y(k-i) + £ a(i) u(k-i) (3,41) 

L=1 i=0 

and for a white input signal, the minimum errors in the 
fourth order two channel autoregressive predictions of y and 
u are 



e (4) (k) = a( 0 ) u(k) (3.42a) 

y 



e 



(4) 

u 



(k) 



u(k) 



(3.42b) 



This results in an optimal prediction error covariance matrix 
given by 



P 



(4) 



a( 0 ) 2 



a( 0 ) 



a( 0 ) 



R (0) 
uu 



(3.43) 



2 

with eigenvalues of 0 and l+a(0) , Also in the case of the 
system described by Table 3.1 where a(3) = a(4) = 0, it is 
seen from equation (3.41) that a perfect backward two channel 
AR prediction of y(k-4) is possible resulting in an optimal 
backward prediction error covariance matrix of 




0 



e{i^ 4) (k)} 



(3.44) 



0 



which also has a zero eigenvalue. 

Since the ellipses obtained by passing a plane through 

the parabolic cost surface have axes whose half lengths given 

by 1/ AT and since P and P each have one eigenvalue of 

( 5 ) 

zero, it is seen that the parabolic bowls along which K 
and ^ adapt, degenerate toward infinitely long parabolic 
troughs as the first four stages converge toward their opti- 
mum values. This is responsible for the slow convergence of 
the overmodeled parameters back toward zero. To avoid this 
problem, some means of detecting this degeneration of the 
cost surface and then reseting the appropriate parameters 
to zero must be found and this certainly provides an inter- 
esting area for future study. 



F. A LATTICE APPROACH FOR MULTICHANNEL ARMA MODELING 

The lattice filter solution methods for the ARMA model 
can readily be generalized to multiple input multiple output 
ARMA models for linear systems. The equations for the 
multichannel ARMA model of the system shown in Figure 3.21 
are developed in Appendix G with the solution for the model 
coefficients given by equation (G.7) repeated here for 
convenience . 
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(3,45) 
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Figure 3.21. A general multichannel ARMA system 



This is clearly a generalization of the single channel ARMA 
equation error solution given by equation (3,20) and just 
as in the single channel case some assumptions are required 
to apply a Levinson type algorithm. 

If it is assumed that all the a..(0) are known or can 
be estimated in another manner, they can be incorporated 
into a matrix given by 



a, , ( 0 ) .... a, ( 0 ) 

\ 



-0 



a^-v , ( 0 ) , . . , a~. ( 0 ) 



(3.46) 



and equation (3.45) becomes 



-YY 


-YU 
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-Yy 


R v 

—Yu 
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^JY 


-UU 
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R tt 

j-Uy 


R, r 
— Uu 




1 

0 

<1 
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_l 



(3.47) 
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This is similar in form to the equations for the solution of 

a (Q^ + Qq) channel autoregression with input channels y-^(k), 

. . . , y n (k) , u, (k) , . , . , u„ (k) so that the multichannel 
^0 1 ^0 

ARMA solution is related to the multichannel AR solution by 



B 


= D 


i 


A 







The multichannel AR prediction error vector is given by 



e ( k) T 



— — 1 


T 




y (k) 




e (k) 




- cy t ! u T ] D = 


-y 






u(k) 




iu (k) _ 



( 3 . 49 ) 



and defining 



<P = 



I 




( 3 . 50 ) 



it follows that 



e(k) ip = ^(k) - u(k) A f 





( 3 . 51 ) 



= e Q (k) 



T 
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establishing the generalization of equation 3.27c relating 
the multichannel AR prediction error vector to the multi- 



channel ARMA error vector e^OO, The ARMA prediction error 
covariance then is given by 



p Q = i T £ i 



(3.52) 



and the coefficient matrix A^ can be set to minimize the 
trace of resulting in a solution given by 



completing the multichannel generalization of the single 
channel results. The portion of the solution given by the 
multichannel autoregression can be solved as before using 
the lattice methods in either batch or adaptive fashion. 

Then the matrix of gains can be obtained from equation (3.53) 
by batch processing or equations (3.37) and (3.39) can be 
generalized to yield an adaptive solution given by 



A n = e (e (k) e (k) } e {e (k) e (k) } 
— 0 — u — u — u — y 



(3.53) 



A Q (k+1) = A Q (k) + 2 




e (k) e n (k) T 
— u —0 



(3.54a) 



where 



a?(k) = (1-a) (j? (k-1 ) + a e (k) T e (k) 
0 0 — u — u 



(3,54b) 
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It is clear that the equations and methods developed earlier 
for the single channel ARMA model are a special case of 
these results with = 1. 
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IV. NONLINEAR SYSTEM MODELING 



The modeling of nonlinear systems is a far more complex 
problem than linear systems modeling. No attempt is made 
here to provide a comprehensive treatment of the problem. 
Rather, two specific models for systems comprised of the 
interconnection of linear and memoryless nonlinear subsys- 
tems are considered. Both of these models, the Volterra 
model and the new nonlinear ARMA model, are shown to be 
generalizations of the MA and ARMA modeling problems explored 
previously so that with appropriate modifications, the 
Levinson algorithm and lattice methods can be used to solve 
for the model parameters. 

A. VOLTERRA NONLINEAR MODELING 

The Volterra series model characterizes nonlinear sys- 
tems using a generalization of convolution where the system 
output is approximated as a summation (possibly infinite) 
of outputs of degree m systems . 

M 

y(k) = £ y m (k) (4.1a) 

m=l 

This is shown pictorially in Figure 4.1. 
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Figure 4.1. The Volterra model for nonlinear systems 

A number of representations for these m-th degree systems 
are possible. The most commonly used representation is in 
terms of symmetric Volterra kernels and is given by 

OO 00 

y (k) = Y .... Y a (n, . . . n ) u(k-n 1 ) . . . u( k-n ) (4.1b) 

m _ “n _ 4rV, s 1 m 1 m 

n - u ru - u 

m 1 

and a (n, . . .n ) is the m-th degree symmetric Volterra kernel, 
s -L m 

(Any permutation of the indicies results in the same value 
for this kernel giving a high degree of symmetry.) 

This model arises quite naturally for a linear system in 
cascade with a power series nonlinearity as shown in Figure 
4.2 for a quadratic nonlinearity where the output can be 
written as 
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00 



oo 



(4.2a) 



y(k) = S 2 a(n^)a(n )u(k-n^)u(k-n 2 ) 

n l=0 n 2 =0 



and 



a (n,n„) = a(n.)a(n 0 ) 
s 1 2 1 2 



(4.2b) 



u(k) 




Figure 4.2. A quadratic nonlinear system. 



The Volterra series model in this form has been widely dis- 
cussed in the literature [e.g. Ref. 1, 6 , 13 , 14 , 25 , 26 , 48 
and 54] since Wiener [Ref. 62] first applied it to systems 
analysis and modeling. It has the added benefit of treating 
linear systems as a special case of the model since the 
first degree kernel is exactly the convolutional represen- 
tation of the MA model. 

The number of kernels (M) required for an accurate model 
depends on the nature of the nonlinearity in the system and as 
long as the nonlinearity is soft, a relatively low degree 
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model will suffice, keeping the problem manageable in that 
regard. The primary difficulty associated with Volterra 
nonlinear modeling arises from the fact that it uses a 
nonrecursive MA representation for the linear portion of 
the system, requiring in general, an infinite memory as 
indicated by the upper limits on the summations in equation 
(4.1b). In practice, these summations need to be truncated 
as shown in equation (4.3) 

N N 

y (k) = £ . . . £ a (n,...n )u(k-n, ) . . .u(k-n ) (4.3) 

J m s 1 m 1 m 

n, =0 n =0 
1 m 

but a large number of terms may still be required to 
accurately model the system. One method of solving for the 
model is to set the parameters (terms of the Volterra kernels) 
to obtain a minimum mean square equation error where the 
equation error is defined as the difference between the sys- 

A 

tern output and y(k). 

To simplify the solution and reduce the number of para- 
meters that must be obtained, the symmetry of the kernels can 

/ 

be exploited by rewriting equation (4,1b) as 

CO 00 00 

y (k) = £ . £ a. (n, , . .n )u(k-n, ) . . . u(k-n ) 

n =0 n 0 =n, n =n , 

1 2 1 m m-1 

(4.4) 

where a^(n^ . . .n^) is the m-th degree triangular Volterra 
kernel. For a finite upper limit of N on the summations 
the m-th degree symmetric kernel will contain (N+l) m terms 
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but only ) of them are distinct with the remainder de- 

termined by symmetry considerations. 

Still another representation for y (k) uses the regular 
form of the Volterra kernel (this terminology has recently 
been introduced by Mitzel, Clancy and Rugh [Refs. 7 and 35]). 

It is given by 

00 00 

y (k) = Y ' . . . y* a (h, ...h )u(k-h, )u(k-h, -h„) 
m , rim 1 12 

h, =0 h =0 
1 m 

... u( k-h, - . . . -h ) (4.5) 

1 m 

where a (h^...h m ) is the m-th degree regular Volterra kernel. 
With infinite upper limits on the summations, the symmetric, 
triangular and regular forms of the Volterra kernels are 
equivalent, however, when finite upper limits are used they 
cover the field of the model kernels in different ways. 

Because of its symmetry, it is reasonable to have equal 
upper limits on all the summations on the symmetric kernel 
as was done in equation (4.3). This is shown in Figure 4.3a 
for a second degree kernel. The equivalent triangular kernel 
is given by 

N N 

y (k) = T' ... T. a. (n, . , .n )u(k-n ) , . .u(k-n ) (4,6a) 

-'m t 1 m 1 m 

n, = 0 n =n , 

1 m m-1 

and it covers the region shown in Figure 4.3b for a second 
degree case. The corresponding regular form expans ion, however , 
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requires variable upper limits on the summations to cover 
the equivalent kernel space as shown in equation (4.6b), and 
Figure 4.3c for a second degree kernel. 



y (k) 
^m 



N N-h, N-h 

£ E --L 

v° v° v° 



1 



a 



r 




•h )u(k-h, )u(k-h, -h 0 ) 
m 1 13 



• i . u(k-h, - . . . -h ) 
1 m 



(4.6b) 



Thus it is seen that only half the field needs to be covered 
by the triangular and regular expansions to identify the 
kernel associated with the square field of the regular ex- 
pansion. When the regular form expansion is used with con- 
stant finite upper limits there is no inherent reason to 
make all the upper limits equal since there is no symmetry 
in the kernel. Considering the regular expansion 



y (k) 
m 




a (h, . • .h )u(k-h, ) . . . 
rim 1 



(u(k-h, - . . . -h ) 
1 m 



(4.7a) 



a triangular expansion given by 



N, h +N h m ,+N 
1 L, 2 i^l m 

(k) = L L • • • L 

n, =0 n 0 =n, n =n m , 
1 2 1 m m-1 



a. (n, . . .n )u(k-n, ) . . .u(k-n ) 
t 1 m 1 m 



(4,7b) 



is required to cover the corresponding field. For a quadratic 
expansion this is shown in Figures 4.4c and 4.4b, The equi- 
valent region in the symmetric field is shown in Figure 4.2a. 



157 




rC 




C 




c 

o 

•H 

CO 

C 

m 

a 

x 

w 

03 



3 

b0 



a> 


03 


PH 


i — ! 




to 

ne 






a 


b0 O) 


V-/ 


C ^ 

*H 

T3 O 




C *H 

O 

ap 
co a) 


a 


a> £ 


0 


£ 


•H 


>> 


CO 


O CO 


C 


a 


03 


a) 


a 


co p; 


X 


a> p 


w 


rH 

£) C 


u 


03 *H 


03 


•H 


rH 




a 


03 i — 1 


b0 


> a> 


c 


•H 


o3 


P P 


•H 


0 




a> 


Eh 


O) 




bo nj 




c a 




o3 a* 


A 


co 




co 




p 


c 


O) 


0 




•H 


3 


CO 


bO 


C 


•H 


03 


P 



a 

x 

w 



a 

•H 

P 

0) 

S 

s 

>> 

co 



158 



space . 





1 c;q 



Figure 4.4. Range of variables corresponding to 
rectangular field in the regular 
kernel space. 



Therefore, identifying a rectangular kernel in the regular 
kernel space is equivalent to obtaining a symmetric kernel 
in the arrow shaped region of Figure 4.4a. 

Which type of expansion is more appropriate for a given 
system depends on the shape of the kernels for that system. 
For example, if a quadratic nonlinear system has a kernel 
with a relatively square shape in the symmetric kernel 
space, a regular form expansion with constant upper summa- 
tion limits will have to estimate many zero valued terms 
and is inefficient. On the other hand if the system has a 
kernel similar to the arrow shaped region of Figure 4.4a 
in the symmetric space, a regular expansion will be efficient 
while a symmetric expansion over a larger field would be 
required with many zero valued parameters . Not enough is 
known about how to relate types of nonlinear systems with 
various shaped kernels however, and the question of kernel 
shape and the best type of expansion is not pursued further 
here . 

1 . Lattice Filter Methods For Volterra Modeling 

Lattice filter methods derived earlier can be applied 
to obtain a minimum mean square equation error solution for 
the Volterra model if the regular form of the expansion is 
used. The Volterra model can be put in the form of a mul- 
tiple input single output MA model by defining a new family 
of signals as nonlinear combinations of delayed values of 
the system input u(k). 



u, . . ., (k)=u(k) u(k-h„) u(k-h 0 -h 0 ) , . . u(k-h 0 - . . , -h ) 

h 0 h 2 2 3 2m 

2 m 



(4.8) 



For finite summations the regular form of the expansion 
becomes 



N N ^ 

mm m2 

y (k) = £ ... V 

m h =0 

m 



h 2 = 0 



N . 
ml 

Z a (h, . . .h )u, . • (k-h, ) 

n r 1 m h„ n 1 

n^=0 2 m 



(4.9) 



and can be regarded as the sum of the outputs of a large 

number of linear filters whose inputs are given by u^ . ..^ (k) . 

2 m 

Equation (4.9) is exactly the form of a input, single 
output MA model where 



Q. = (N 0 +l)(N ,+l)...(N +1) (4.10) 

m2 m3 mm 

Furthermore, if the same upper limit on the summations over 
h^ is used in each of the various m-th degree systems, 

(^11 = ^21 = ’ ' ' =N ml) ’ " t ^ ie overa l 1 Volterra model given by 
equations (4.1a) and (4.9) is in a form suitable for solu- 
tion via the multichannel Levinson algorithm or the multi- 
channel MA lattice methods . The requirement to use the same 
upper limit on all summations over h^ arises because the 
Levinson algorithm and lattice methods assume the same amount 
of memory in each channel of the model. 

As a specific example consider a second degree ex- 
pansion where N ]_i =N 2 l =N l and N 21 = N 2* 
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N. 



n 2 n x 



y(k) = T a (h )u(k-h ) + 23 23 a (h,h 9 )u Ck-h, ) 

h x =0 r 1 1 h 2 =0 h 1= 0 r 1 2 h 2 1 



(4.11a) 



u, (k) = u(k)u(k-h 0 ) 
h 2 2 



(4 . lib) 



Defining data and coefficient vectors for each channel given 
by 



u3 (k) = [u, (k)...u, (k-N,)]^ 

— h 2 h 2 h 2 1 



(4.12a) 



*h. 



Ca r (0,h 2 ) . . .a r (N 1 , h 2 )l 



(4.12b) 



and embedding them into single data and coefficient vectors 
written as 



xhk> = [uhk) T ! h(k> T ' 



— o 



i • • • i 



u* (k) T ] (4.12c) 

2 



T l T 
r + 1 1 + 
[a | a n 

l u 



T T 



-N, 



i i 



( 4 . 12d ) 



the equation for the Volterra model output becomes 



J(k) = X + (k) T d + 



(4 . 12e ) 
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which is clearly of the same form as equation (A. 24a) and 
represents a ^ + 2 input, single output MA model. All the 
nonlinearities in the Volterra model are external to this 
MA model, in the formation of its various input signals from 
the system input u(k) . This model is illustrated in Figure 
4.5. 

It is interesting to consider what the recursive in 
order nature of the Levinson algorithm or lattice methods 
mean to the nonlinear Volterra problem. In building up the 
MA model solution recursively in order, the upper limit of 
the summations over h^ is increased until the desired value 
is reached. In terms of the Volterra model kernels, this 
means allowing each of the regular form kernels to grow in 
size in the h^ dimension while holding their boundaries fixed 
at prespecified values in all the other dimensions. In the 
linear MA model, the kernel has only one dimension and there- 
fore the recursive in order solution eliminates any require- 
ment to prespecify its upper limit (the order of the model). 
For the higher degree nonlinear kernels, these- methods reduce 
by one the number of kernel boundaries that must be pre- 
specified for each kernel. Allowing the kernel to grow in 
any of the other M-l dimensions (h^ through h ) corresponds 
to adding additional channels to an existing lattice and 
simple methods of accomplishing this are not available. 
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Figure 4.5. A second degree nonlinear Volterra 
model + in multichannel MA form. 

The at represents are coeffi- 
cients ^of the single input single 
output MA models shown. 
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B. NONLINEAR ARMA MODELING 



As was previously mentioned, the primary difficulty 
associated with the Volterra series model arises from the 
fact that it is a nonlinear generalization of the MA model 
and as such, a large number of terms may be required to 
accurately represent even a mildly nonlinear system. In 
linear system modeling this difficulty was remedied by 
using the more general ARMA model. It is reasonable to 
assume therefore that a nonlinear generalization of the 
ARMA model could remedy the problem in the nonlinear 
modeling case (for at least certain types of nonlinear 
systems). Such a generalization called the nonlinear ARMA 
model has recently been proposed. [Ref. 64] This model 
forms an estimate of the current value of the system output 
as follows: 
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oo oo 



+ ^ C(n^m^)u(k-n^)y(k-m^)+. . 

n^ = 0 m, =1 




n.=0 n =0 m =1 
1 p 1 



OO 




q 



. . . y (k-M ) 

q 



(4.13) 



The first three terms of equation (4.13) are a discrete 
Volterra expansion of the input signal u(k) and represent 
F 10 [u(k)] in terms of the discussion of Chapter I. The 
second three terms are a discrete Volterra expansion on the 
system output delayed one sample interval and represent 
F 2 Q[y(k-l)]. The final two terms are bivariate expansions 
of the system input and delayed output and represent 
F 3Q [u(k) , y(k-l)]. (This is the first and only model con- 
sidered where F ^ g C ' 3 is not assumed to be zero.) Equation 
(4.13) is clearly a nonlinear extension of the linear ARMA 
model contained in the first and fourth terms. As was the 
case in the Volterra model} the number of multiple summations 
required in (4.13) is dependent upon the nature of the non- 
linearity in the system being modeled. The upper limits on 
the summations in (4.13) have been written as infinity in- 
dicating a requirement for infinite memory or model order. 

As is discussed subsequently, however , the required model 
order (memory) may in fact be finite due to the nature of 
the system being modeled, thereby alleviating the difficulty 
encountered in the Volterra nonlinear model previously presented. 
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The kernels of the input expansion and output expansion 
a^C*) and t> s (*) are symmetric since any permutation of the 
indicies results in the same value for the kernel. They 
have therefore been labeled with a subscript "s" and can 
also be written in triangular and regular form. 
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( 4 . 14 ) 
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m =0 m 0 =m, m =m .. q q 

1 2 1 q q-1 



00 00 

= ^ ... 2 b (h, . . .h )y(k-l-h, )y(k-l-h, -h 9 ) 

h, =0 h =0 r x q x 

i q 



. . . y (k-l-h, - . . . -h ) 

1 q 



( 4 . 15 ) 



167 



In writing the triangular and regular forms in equation (4.15) 
the lower index on the summations has been shifted to zero. 

In the case of the bivariate expansion terms in equation 
(4.13) the kernels do not possess symmetry so that trian- 
gular expansions are not possible but regular form expansions 
are possible. 



00 OO GO 00 
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h , -o 
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1 q+1 1 p+q 

(4.16) 

Thus two regular forms are possible for the bivariate expan- 
sions. Figures 4.6 and 4.7 illustrate the manner in which 
the regular form of the bivariate expansion covers the field 
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Figure 4.6. Range of variables for a rectangular 
field in the original form of the 
bivariate expansion of degree two. 
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of model kernels for a second degree case for finite upper 
limits of and ^ on the summations . It is interesting 
to note that because of a lack of symmetry in the original 
form of the bivariate expansion, the causal region in the 
regular form extends outside the quarter plane. 

As was the case with the Volterra model, it will be 
shown that in regular form, a minimum mean square equation 
error solution can be obtained for the nonlinear ARMA model 
coefficients using either the Levinson algorithm or the 
lattice methods. This will provide the nonlinear generali- 
zation of the results presented in Chapter III on linear 
ARMA modeling. Before developing this method of solution, 
however, the applicability of the nonlinear ARMA model to 
various types of systems and its memory requirements will 
be considered. 

1 . Identif iability Conditions and Memory Requirements 
In the previous chapter on linear ARMA modeling it 
was stated that there were two facets to the question of 
model identif iability ; input signal requirements and mea- 
surement requirements. The question of measurement re- 
quirements was not discussed, however , since it was assumed 
that all signals (input and output) were observed. In the 
study of systems comprised of interconnected linear and 
nonlinear subsystems, however, various internal signals exist 
and the effect of either observing or not observing them on 
the modeling process must be explored. To do so, it will 
be assumed that the system under study can be put into the 
form of Figure 4.8 fulfilling the following equations. 
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x L (k) 


= y^Ck) + 1c ) (4.17a) 


AT 
✓— \ 

' 


= ^2 Xi/ k ^ + C^H. (k) (4.17b) 


^ (k) 


= F Cx (k) ] ( 4 . 17c) 

— — n 


y L ( k) 


= T Cx L (x)] ( 4 . 17d ) 


where 

x L (k) 


= [x^-^(k) ... x^p(k)]^ (4.18a) 


a 


vector of inputs to P linear subsystems; 


• £ N (k) 


= [x N1 (k) ... x N p(k)]^ (4.18b) 


a 


vector of inputs to Q nonlinear memoryless subsystems 


*L (k) 


= CyLp(k) ... y L p(k)]^ (4.18c) 


a 


vector of outputs from P linear subsystems; 


yn<k) 


= Cy N1 (k) ... y N( ^(k)] ( 4 . 18d ) 



a vector of outputs from Q nonlinear memoryless 
subsystems . 
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Figure 4.8. A General Nonlinear System 
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The z - transforms of these signals are also defined as 
X^(z), X^Cz), Y^Cz), Y N ( z ) . £^ , £ 2 ’ £]_ an< ^ £2 are matraces 

whose elements are either 0 , -1 or +1 indicating the 
interconnections of the various subsystems. T[ . ] and F[ . ] 
are diagonal matrices specifing the linear and nonlinear 
subsystems as follows. 

Y T (z) = T ( z ) X T (z) (4.19a) 

— Li — — Li 



where 



T ( z ) = 



[T. .(z)] 
19 



A . ( z) 

1 

1-B. (z) 
1 









(4.19b) 



and 



N 



A. ( z) = 2 a . (n) z 

1 n= 0 1 



-n 



(4.19c) 



B 



N 

.(z) = 2 b.(n) z" n 

1 n=l 1 



( 4 . 19d) 



y N (k) = I tx N (k)] = 



F 1 [.] 



f 2 C.] 



f q [.] 



x (k) 

1 

* 

x M ’ ( k ) 

n 2 



x NQ (k) 






F Q*- X NQ (k) 



(4.20) 



i n it 



This overall system given by Figure 4.8 is adequate to re- 
present a broad class of systems comprised of intercon- 
nections of linear and nonlinear subsystems including 
cascades, parallel connections and feedback systems. 

Equations (4.19) and (4.20) assume all the subsystems 
are single input single output noninteracting systems. If 
desired, the collection of linear subsystems given by 
T(z) can readily be put into the general multichannel ARMA 
form of Section III.F to allow each output to be a function 
of past values of all outputs , and past and present values 
of all inputs. 

Alternate representations of these linear and non- 
linear subsystems are also useful. Equations (4.19) are 
equivalent to 

Y t (z) = [A.(z)]X t (z) + [B.(z)]Y t (z) (4.21a) 

— Li 1 — Li 1 Li 

or in the time domain 

y T (k) = [a . (k) ]*x T (k) + [b . (k) ]*y T (k) (4.21b) 

Here * represents convolution and is carried out in the 

same sense as matrix multiplication and the matrices, 

Ca.(k)], Cb.(k)], CA.(z)] and [B.(z)l are diagonal matrices 

whose i-th entries are the time domain functions a^(k) or 

b^(k) or the polynomials A^(z) or B^(z). The time domain 

functions a.(k) and b.(k) are the inverse transforms of the 
x x 

polynomials A^(z) and B^(z). This can also be represented 
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in nonrecursive form as 



y^(k) = Ch^(k) ]*x^(x) 



(4.22a) 



where [h^(k)] is a diagonal matrix of impulse responses 
defined by 

oo 

h.(k) = 2 h . (n) 5 (k-n) (4.22b) 

1 n= 0 1 

The nonlinear systems can also be represented in terms of 
inverse functions assuming they exist over the necessary 
ranges of the variables so that 





f” 1 !! • ] o 




y Nl (k) 


11 

1 — 1 
/ — ^ 

1 1 

1 — 1 

1 

M 

m 

Vw/ 


0 'f' 1 :-] 




W k) 



F i 1 [y Ni (k)] 

♦ 

F Q ^- y NQ (k) ^ 



(4.23) 



So that equations (4.17) are iterative it is 
necessary that no delay free loops exist in the system of 
Figure 4.8. A necessary and sufficient condition for the 
absence of delay free loops is developed in Appendix H and 
requires that the terms of the determinant of the matrix 
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a = Lim £ 2 T (z) £ 1 (4.24) 

z->«> 

and all its principal minors must be zero. 

The various signal vectors and equations (4.17) can 
now be partitioned as 

u(k) (4.25a) 

u (k) (4.25b) 

(4.25c) 

(4 . 25d) 

Equation (4.25c) can also be written in terms of inverse 
functions as in equation (4.23), and (4.25d) can be written 
using either recursive or nonrecursive representations of 
the linear systems as was done in (4,21) and (4,22). The 
single primed signal vectors in equations (4.25) represent 
those signals which are observed and the double primed signals 
are those which are not. It is assumed that all input signals 
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*N Ck ? 
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i b [ 
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£ 2b 
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-x’(kf 
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in u(k) are observed. It is possible to rewrite equations 
(4.25) as a single composite matrix equation as done in 
(4.26a). As written, equation (4.2 6a) is an infinite memory or 
model order representation because of the presence of the 
h(k) ; * operators but a finite memory version can be written 
by expressing rows 4 and 8 as 

y' (k) = a (k)*x’ (k) + b (k)*y ' (k) + a, (k) *x!' (k) 

*-L —a — L —a — b — L 



+ b, (k) *y" (k) (4.26b) 

— b — L 

y£(k) = a Q (k) *x^(k) + b c (k)^(k) + a^(k)*x£(k) 

+ b^(k)*y£(k) (4.26c) 

The third and seventh rows can also be written in terms of 
inverse functions if desired as 

x^(k) = F~^[y^(k) ] + F ~ [ y ( k ) ] (4.25d) 

xJJ(k) = r^y'lk)] + ^[^(k)] (4 . 26e) 

Now the problem of writing a system of iterative equations 
for the observed signals in terms of only observed signals 
consists of rewriting equation (4.26a) so that the upper 
right 4 by 5 partition is a null matrix. If this can be 
done, the general form of the nonlinear ARMA model can be 
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used to identify the composite effects of the operators 
appearing in the upper left 5 by 5 partition. In some cases 
this will only be possible of the infinite memory version of 
the model (with h(k)* operators) is used and in other cases 
a finite memory model will suffice, 



memory nonlinear ARMA models can be used to identify a given 
system is illustrated for two examples in Appendix I. First 
a system consisting of a cascade of linear and nonlinear 
systems is considered. Then a model of the tracking behavior 
of a phase lacked loop is put in the form of a nonlinear ARMA 
representation . 

2 . Lattice Filter Methods for the Nonlinear ARMA ’Model 
As was the case with Volterra modeling, lattice 
filter methods can be applied to the nonlinear ARMA modeling 
problem if the regular form of the kernels is used. A 
family of signals is defined as nonlinear combinations of 
delayed values of y(k) and u(k) as follows. 



The process of determining whether infinite or finite 




u(k) u(k-h 0 )u (k-h 0 -h 0 ) . . . u (k-h 0 - . . . -h ) 
l 15 l m 



(4.27a) 



y h h (k) - y(k-l)y(k-l-h 2 )y(k-l-h 2 -h 3 ) 



. , .y(k-l-h„- . . . -h ) 

2 m 



(4.27b) 
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u, , y, , (k) - u(k) u(k-h 0 ) . . . u(k-h,- . . . -h ) 

V*‘ h p h p+l'** h p+G 2 2 P 



y(k-l-h 2 -. . >-h p+1 ) 



. . .y (k-l-h, - . . . -h ) (4.27c) 

y 1 p+q 



or alternately 



y 



h 2 -..h q Uh q+1 --- h q+p = y(k-l)y(k-l-h 2 ) . . .y(k-l-h 2 -. . .-h ) 



u ( k— h 2 - • • . -h^ +q ) , . .u(k-h 2 ~. . . -h q+ p ) 



(4.27d) 



With finite summations, the regular form of equation (4.14) 
becomes 



N P>P 

2 

h =0 
P 



N P,2 

2 

h 2 = 0 



Vl 

2 

V° 



a (h, ...h ) u, , (k-h, ) 

r 1 p h 2 • • .h 1 



Equation (4.15) becomes 



(4.28a) 



M q>q 

2 

h =0 

q 



V2 

2 

h 2 = 0 



n 



q,l 

V b (h, . . .h ) y, , (k-h, ) 

■o rl q- , h 9 ...n 1 



h,=0 



(4.28b) 
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and equations (4.16) become 



p+q,p+q 

2 

h p+q"° 



J p+q,2 

2 

h 2 = 0 



L 



p+q,l 



h^ = 0 



C rl (h l * 



. h . ) u, 

p+q h 



2 * * ‘ h p 



or 



y, , (k-h ) 

J h , , . . .h , 1 

p+1 p+q 



(4.28c) 



J q+p,q+p 

2 



h q+p“° 



J q+p,2 

2 

h 2 = 0 



J q+p,l 

2 C r2 (h l’ * ,h a+p ) y h 9 . . .h 

h 1 =o 2 q 



u, , (k-h, ) 

n . t . . . n . -L 

q+l q+p 



(4 . 28d) 



These terms can be viewed as the summation of the outputs 
of a large number of linear filters whose inputs are the 
signals defined in equations (4.27). In the context of 
multichannel filtering, each of these filters can be con- 
sidered as a single channel and each of the input signals 
can be associated with one of the channels. Since the 
lattice models use the same amount of memory in each channel, 
the upper limits on all summations over h^ will be made the 
same (N n =N ,=L, , = N, ) . The upper limits on the 

p,l q , 1 p+q,l 1 ^ 

summations over the other indicies determine the number of 
channels required. 
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For a quadratic nonlinear system, the present value 
of the system output is estimated as 



y(k) 



N 



1 ”22 

2 a r- (h i )u(k " h i ) + 2 



N. 



h 1 = 0 



h 2 = 0 



S a (h, ,h 0 )u, (k-h, ) 

r 1 2 h 2 1 



h 1 = 0 



N 1 M 22 N 1 

+ 2 b r (h 1 )y(k-h 1 ) + 2 2 b r (h l ,h 2 )y h (k_h i ) 

h 1 =l h 2 =0 h x =0 2 



2 2 

+ 2 

h 2 = 0 



N. 



2 C ri (h i’ h 2 ) u ^ (k ” h i ) 



h x = 0 



(4.29) 



where u^ (k) = u(k)u(k-h 2 ) , y^ (k) = y(k-l)y(k-l-h 0 ) 

and u y^ (k) = u(k) y(k-l-h 2 ) . Signal and coefficient vectors 
can be defined for the various quantities in equation (4.29) 
following the conventions established earlier; for example 



and 



T 

y(k) = Cy(k-l) ... y(k-N 1 )] 

?h 2 (k) Cy h 2 (k) ••• y h 2 <k - M i )] 

T 

b = Cb (1) . . . b (N, )] 

— r r 1 

+ T 

b h = Cb r (0,k 2 ) ... b r (N lS h 2 )] 



(4.30a) 
(4.30b) 
(4.30c) 
(4 . 30d) 
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Embedding all these vectors into single data and coefficient 
vectors, equation (4.29) becomes 

y(k)=_X^(k)^d^ (4.31a) 

where 

X x (k) = [y (k) T 



and 

T 

, r.T + 1 

d^ = [b a 



The minimum mean square equation error solution for the 
model coefficient vector d^ is given by 

e{ X q(k) X 1 (k) T } d 1 = e{X x (k) y(k)} (4.32) 

where the equation error is defined as 
e(k) = y(k) - y(k) 



+ T 
u (k) 1 



+ T ■ 

Yn< k > , 



+ 

• y m 



y M (k) 

22 



+ T ! 

H 0 (k) I • 

l 



i + 

I h. n 
, 2 2 



(k) 






b +T ' 

-o ! 



+ T . 

-o ! 



+ T . 

-o < 



+ T 
(k) 1 



I + T 

I uy r (k) 1 ] 
~ l 22 



£ 



Zn 



+ 



22 

T 

22 

T 

22 



(4.31b) 



(4. 31c) 
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While equation (4.32) is similar in form to equations 



(A. 7a) and (A. 26a) for AR and HA models, it lacks the nec- 
essary form for the application of the Levinson algorithm 
because of the fact that some component data and coefficient 
vectors are N-^-vectors while others are (N^+l) -vectors . 

This is the same problem that arose in linear ARMA modeling 
and can be handled here in a similar manner. Defining 



ip = [1 -a (0) - b (0,0) ... 

— r r ’ 


■ - b r (0 > M 22 ) 


- a (0,0) 
r 


-V°.u 22 ) 


- c , ( 0 , 0 ) 
rl 5 


T 

-C rl^’^22^ 



(4.33a) 



and 




x(k) = Cy(k) u(k) y Q (k) ... 


y M (k) 
22 


Uq ( k) . . . 


u. r (k) 
N 22 


uy^ (k) 


T 

uy. (k)] 
L 22 




(4.33b) 



and assuming that the coefficients in 4> can be estimated in 
some other manner, the MMSE solution for the remaining 
model coefficients becomes 
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(4.34) 



e { X (k) X(k) T } d = e{_X(k) x(k) T } ip 

Here X(k) and d are defined as X-^(k) and d^ with all 
superscripts "+" removed from their component vectors (indi- 
cating they are all indexed from 1 to and are all N^- 
vectors). Note that just as in the linear ARMA case, the 
coefficients in ip essentially correspond to gains on their 
respective channels. Comparing equation (4.34) with (A. 7a) 
for a multichannel autoregression it is clear that d can be 
obtained from the multichannel AR solution by 

d = JD rp (4.35) 

where the signals in x(k) comprise the channels in the auto- 
regression. Therefore, either the Levinson algorithm or the 
lattice methods can be used to solve for _D and from it and 
a knowledge of ip, the nonlinear ARMA model coefficients can 
be obtained. 

By analogy with equation (3.27) it also follows that 
the nonlinear ARMA equation error is related to the AR pre- 
diction error vector by 

e(k)=^e(k) (4.36a) 

so that 

e {e (k) 2 } = V 1 P ip (4.36b) 
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where P is the AR prediction error covariance matrix. The 
coefficients in x p can therefore be set to minimize the mean 
square value of the nonlinear ARMA equation error in (4.36b) 
resulting in a solution given by 



P 2 2 


* ' ' P 2N 




a ( 0 ) 
r 




P 21 


• 






, 


= 




P N2 


P NN 




c (0,L. > 
r ’22 




P N1 



(4.37a) 



where N is the total number of channels in the model 

N = 1 + 1 + (M 22 +l) + (N 22 +l) + (L 22 +l) 

and the p^ are the elements of the prediction error co- 
variance matrix. It is readily apparent that the linear 
ARMA model and its solution via the Levinson algorithm or 
lattice methods are a special case of this formulation of 
the nonlinear ARMA model just as one would expect. 
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V. APPLICATIONS, CONCLUSIONS, AND OPEN QUESTIONS 



In the previous four chapters, existing methods for AR 
and MA modeling were reviewed and from them new methods for 
linear and nonlinear ARMA modeling were developed. Here, 
two applications for these new methods in reduced order 
modeling and modeling for fault detection and diagnosis 
are examined briefly. Then the results of this research 
are summarized, conclusions drawn and significant open 
questions for the continuation and extension of this work 
are listed. 

A. APPLICATIONS 

1 . Reduced Order Modeling 

Oftentimes, complex physical systems, both linear 
and nonlinear, can be approximated quite closely using 
simple models. The lattice solution methods developed here 
provide a very natural and efficient means of determining 
reduced order models for complex, high order systems especially 
in the case of linear systems. In Chapter III it was argued 
that for linear systems it is reasonable to build up ARMA 
models by simultaneously incrementing the order of the numera- 
tor and denominator polynomials as the lattice method does. 

When this method is used to build up a, given order model, 

all lower order models and their mean square values of equation 

error are readily obtained as well (the only additional 



calculations needed are for the MSE and the gain term or 
gain matrix in the multichannel case) making it easy to 
compare the various models and decide if reduced order models 
provide sufficient accuracy. 

Consider for example the seventh order system whose 
characteristics are listed in Table 5.1. The magnitude 
spectra of second, third, sixth and seventh order lattice 
models obtained using batch processing with 4000 point 
averages are compared to that of the system in Figure 5.1. 

It is apparent that a second order model is unable to 
approximate the system well, however a third order model does 
provide a good approximation. Furthermore, increasing the 
model order to four, five and six fails to significantly 
improve its performance as evidenced by the sixth order 
plot in Figure 5.1c. The model accuracy is not significantly 
increased until seventh order (which corresponds to the order 
of the actual system) when a very good fit is achieved. This 
is further illustrated by the plot of the normalized mean 
square equation error as a function of model order shown in 
Figure 5.2. The cost drops rapidly going from second to 
third order but then fails to decrease significantly until 
seventh order is reached. The roots of the system and the 
various order models are also plotted in Figure 5.3. 

The benefits of the lattice method for reduced order 
nonlinea,r ARMA modeling a,re not quite so pronounced however, 
since adding extra stages to the lattice corresponds to 
allowing the kernels to grow in only one of their multiple 
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Figure 5.2. Mean square value of equation error (as 
a percentage of the mean square value of 
system output) vs, model order for lattice 
models of system E obtained using batch 
processing and 4000 point averages. 
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dimensions. (The linear "kernel" had only one dimension.) 

Here some a priori information would be useful in determining 
the values to prespecify for the boundaries of the kernels 
in the other dimensions. Still however, when compared to a 
brute force matrix inversion approach where all kernel 
boundaries must be prespecified, the lattice method provides 
a viable alternative in obtaining reduced order models 
especially for low degree systems. 

2 . Modeling For Fault Detection 

The problem of fault detection and possible diagnosis 
can be formulated as follows. 

a. Obtain a parameterization that describes the 
current functioning of the system under test. 

b. From this parameterization, determine if the 
system is functioning normally or if a fault 

has occurred by comparison with a fault dictionary. 
It is the first part of this problem that has been addressed 
in this work. The parameterization can be as simple as 
sampled measurements of the response to specific inputs 
however, the large volume of data that would generally be 
involved in such an approach would greatly complicate the 
second part of the problem- A more efficient approach in 
terms of utilization of parameters is to model the system 
and use the model parameters a,s a description of .its 
current functioning. 

For linear systems, AFMA models provide a very 
general framework with a number of possible parameter sets. 
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Three candidates are polynomial coefficients , root locations 

and lattice reflection coefficients with the latter offering 

many advantages . In addition to the advantages demonstrated 

by the experimental results of Chapter III, reflection 

coefficients provide a very effective and methodical way to 

build up knowledge of a system's characteristics. As model 

order is increased to more accurately represent a system, 

reflection coefficients already determined don't change 

making them ideal candidates for use in a dictionary lookup 

scheme (a characteristic not shared by the other parame- 

terizations ) . This is made more important since reduced 

order models could be adequate to detect and perhaps diagnose 

some faults, especially catastrophic ones. While more 

parameters are required when reflection coefficients are 

used, these same coefficients also provide all reduced order 

models. For a single channel ARMA example, 3N reflection 

coefficients and N gains provide all models from order 1 to 
2 

N while N +2N parameters would be required using either 
polynomial coefficients or roots to provide the same infor- 
mation. (6N reflection coefficients are needed if the input 
is white noise since k^ 2 = k 22 = 0 *) 

A similar argument could be made for the use of 
lattice reflection coefficients with the nonlinear ARMA 
model for fault detection and diagnosis of nonlinear systems. 
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B. CONCLUSIONS AND OPEN QUESTIONS 

The purpose of this research was to extend existing 
theories and methods in the modeling of linear and nonlinear 
systems to broader, more general types of models. After a 
discussion of available results in AR and MA modeling of 
linear systems with particular emphasis on the Levinson 
algorithm and lattice filter methods, model transition 
formulas were developed to relate the more general ARMA 
model for linear systems to the AR and MA models. It was 
shown that with suitable assumptions, the ARMA model solu- 
tion could also be obtained recursively in order using 
either a modified Levinson algorithm or lattice filter 
methods. These results were developed extensively in both 
theory and practice for single channel linear ARMA modeling 
with experimental verification of both the batch processing 
and adaptive lattice methods presented. Portions of these 
results have already been published. [Refs. 65 and 66] 

The theory was also developed to generalize these results 
to the multichannel ARMA case. 

Based on the simulation results it was concluded that 
the lattice methods offer the following advantages over a 
conventional brute force matrix inversion approach to ARMA 
modeling using windowed correlation estimates. 

1. For short runs of data the batch lattice methods 
provide much more accurate results than the brute 
force method . 
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2. The batch lattice method performs much better than 
the brute force method when the system is over- 
modeled , 

3. The MSE as a function of model order is well behaved 
for the lattice method. 

4. The adaptive lattice method has difficulty tracking 
zero valued overmodeled parameters. 

The cost of these advantages is the extra computational 
burden of passing the data through the lattice filter during 
the modeling process. 

In the discussion of nonlinear system models the Volterra 
model was considered as a nonlinear extension of MA modeling 
and it was shown that lattice methods could be used to obtain 
the model solution if the problem was recast, using the 
regular form of the Volterra kernels. Then the new nonlinear 
ARMA model was considered and it was shown that this repre- 
sentation in some cases solves the problem of requiring a 
very large number of model parameters encountered in Volterra 
modeling. Then lattice methods were developed for the non- 
linear ARMA problem and it was shown that the linear ARMA 
techniques presented earlier are a speical case of the non- 
linear ARMA methods. For both types of nonlinear models, 
the recursive in order nature of the lattice methods was 
shown to allow the various model kernels to grow in one 
dimension while holding their boundaries fixed at pre- 
specified values in the other dimensions. The use of the 
model nonlinear ARMA was also illustrated with two examples 
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and a nonlinear ARMA model was proposed for the tracking 
behavior of a phase locked loop. 

Several significant questions remain for the continuation 
and extension of this work and are listed here, 

1. Stability of the linear and nonlinear ARMA models 
must be considered. In the linear problem, stability 
is dependent on the roots of the demoninator poly- 
nomial of the synthesis model. The methods 
developed do not guarantee stability of the resulting 
model. (This was not found to be a problem in 
practices however , unless extremely short runs of 
data in the range of 30 to 50 samples were used. 

Even then, model instability was not a frequent 
occurrence.) Stability for the nonlinear ARMA model 
remains to be clearly defined. 

2. Input signal requirements in the nonlinear ARMA 
modeling process need to be investigated. In linear 
ARMA modeling the power spectrum of the input signal 
was found to play an important role. No requirements 
emerged however, on the probability density function 
(pdf) of the input. In nonlinear systems where the 
behavior is inherently level dependent , it is in- 
tuitively appealing to use an input signal with a 
flat power spectrum across the frequency range of 
interest and whose amplitude is uniformly distributed 
over the range of interest. 
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3 . The inability of the adaptive lattice method to 
track zero valued overmodeled parameters is an 
interesting problem warranting further consideration. 
If some means of detecting the degeneration of the 
cost surface towards an infinite trough can be found, 
the problem could be remedied by simply reseting the 
appropriate parameters to zero. 

4. Experimental experience needs to be gained with the 
nonlinear ARMA model itself and the lattice methods 
developed for it. 

5 . The char acteritics of the lattice solution methods 
need to be further quantified to gain a comprehensive 
understanding of how and why it performs as it does. 
Also, further comparisons should be made between 

the lattice methods and conventional methods. Some 
comparisons were made here for batch processing 
methods but only a rectangular window was used on 
the data. Comparisons should be made using other 
types of window functions in the brute force method 
and the adaptive lattice method should also be com- 
pared with a conventional L11S adaptive algorithm 
applied to the equation error model in which all of 
the a(i) and b(i) coefficients are adapted simul- 
taneously , 

6. In the adaptive lattice method, scaling of the lattice 
input signals needs to be investigated. It was 
noted that the ratio of largest to smallest eigenvalue 
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of the input autocorrelation matrix was related to 
the speed of convergence of the adaptive algorithm. 
For the first lattice stage this matrix is 



If the system has a high gain such that the mean 
square value of the output y(k) is very much greater 
than that of the input u(k) , convergence will be 
slow. This could be remedied by implementing an 
adaptive scaling scheme at the lattice input (perhaps 
similar to the first order low pass filter estimates 
used for adaptive step size normalization) . 




y(k) u(k) 




} 



u(k) y(k) u^(k) 
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APPENDIX A 



Alternate Multichannel Model Forms 



Multichannel generalization of the MA and AR models were 
discussed in Chapter II along with their solution via the 
Levinson algorithm. Here the multichannel models and the 
Levinson algorithm are developed in an alternate form more 
compatible with other linear and nonlinear modeling problems 
to be explored later. 

Consider first an N-th order, Q^-channel, AR model where 
the current value of the signal vector x(k) 



is to be predicted from a weighted combination of N past 
values of each of the component signals. For each signal 
this estimate can be written as 



T 



x(k) = [x.. (k) ... x n (k)] 
1 ^0 



(A. la) 



T 




(A. lb) 



x 




d . . (n) x . (k-n) 
J-3 1 



(A. 2a) 



or 




d. . (z) X. (z) 
13 1 



(A2 .b) 



where 



N 



= 7 ; d . . (n) z 

*rrr 13 

n=l 



d . . ( z) 
13 



(A. 2c) 



Define an N-vector for each of the channels to contain 
their required time histories as 

T 

x^(k) = [x^(k-l) x^(k-N)] (A. 3a) 



for 1 < i j< Qg and embed all of these vectors into a single 
NQg-data vector given by 



X(k) = [x-^Ck) 



x-. (k) T ] 



(A. 3b) 



y 

Define a NQ^ x matris of weights as 



D 



in 



V 



-IQ, 



~Q n Q 



0^0 



(A. 4a) 



where the N-vectors d^ are given by 



4ij = Cd i: (1> ••• d ij (N)] 



(A. 4b) 



and contain the coefficients of the polynomials d^(z). 
These polynomials can be combined in a matrix polynomial 
defined by 



D(z) = 



d n <z> 






d lQ 0 (2) 



V !) 



(A. 4c) 
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With these definitions, the N-th order prediction of x(k) 
and its associated prediction error vector becomes 



x(k) T = X_(k) T D . (A. 5a) 

e(k)^ = x(k)^ - x(k)^ (A. 5b) 



or in the transform domain 

E ( z ) T = X(z) T [I - D(z)] (A. 5c) 

Comparison of equation (A. 5c) and (2.44c) show that this 
matrix polynomial differs from the more generally used form 
of (2.44c) by a transposition. The coefficient matrix JD 
can be found by minimizing the trace of the prediction 
error covariance matrix 

P = e (e (k) e(k) T } (A. 6) 

leading to a system of linear equations given by 

e(X(k) _X (k) T ) D = £(X.(k) x(k) T ) (A. 7a) 

or 



R R 




r . , . r 


-x 1 x 1 -x 1 x Qo 




x i x i x i x q 0 


• • 




« « 


• 


D = 


• 


R « . . R 

— X~ — x~ X A 




r r 


Q 0 1 Q o Q 0 J 


I 


-V 1 



(A. 7b) 
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Adopting a shorthand notation, this becomes 



R D 



(A. 7c) 



As in the case of a single channel autoregression, the 
multichannel generalization of the Levinson algorithm also 
requires that the backward prediction problem of estimating 
x(k-N) backward in time from the values of x(k-N+l) through 
x(k) , be solved simultaneously. Using an overscore to 
indicate quantities associated with the backward problem it 
follows that 



— T — „ T 

X(k) D = x(k) 



(A. 8a) 



T T /s T 

e(k) = x(k-N) - x(k) 



(A. 3b) 



or 



and 



—l 



D 



T 

: X(Z) Z~^[I 


- 


T 




: [x, (k) 


X. 


—1 


— ( 


: [x . (k-N+1) 


. . . X . 


l 


1 


-11 


3lQ o 


\i ••• 


%*o 



(A. 8c) 



( A. 8d) 



(A. 8e) 



(A. 8f ) 
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• 1 • 



(A.8g) 



-ij = Cd ij (1) • ’ * d ij (N) ] 



w 



ith D(z) = 



d ll (z) 



d lQ 0 (z) 



d 0 i(z) ••• d n 0 



N 



and d . . ( z 

ID 



) = d . . ( n ) 

n = l J 



n 



( A , 8h) 



( A . 8 i ) 



Setting the coefficients of this backward predictor to mini- 
mize the trace of the backward prediction error covariance 
matrix 



P = e{e(k) e (k) T } 



(A. 9) 



leads to a MMSE solution given by 



s {X (k) X (k) } D = £ { X (k) x(k-N) T } 



(A. 10a) 



— — T T 

and since e{x.(k) x.,, } = R 

J i D 



and s{x.(k) x.(k-N)} = r 

— X *“‘1 — *X « X . 

J D 1 



this can be written as 
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(A. 10b) 



Adopting a shorthand notation, this becomes 



R D = r 



(A. 11) 



At this point it is important to take note of a subtle 
difference between the single and multichannel problems that 
has arisen. While the single channel equivalent of equation 
(A. 10b) for the backward predictor was not written previously, 
it is obviously 



R 

— x 




d 



r 

~ x l x l 



(A. 12a) 



and since 



r T 

- x l x l 



E x x 

x i l 



(A. 12b) 



it follows that the single channel forward and backward 
prediction problems have exactly the same solution (as was 
found to be the case in Section II. D. when the AR prediction 
error lattice was developed). This fact is responsible for 
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a number of simplifications in the single channel case: 

a. ) Only a single reflection coefficient k^ n+ D was 

needed to link the n-th and (n+l)-st order solu- 
tions for both the backward and forward problems 
(see equations (2.25b) and (2.25c)). 

b. ) | B ( z ) | = | B ( z ) | 

c. ) e{e(k) 2 } = e{e(k) 2 } 

d. ) The development of the Burg method and the Itakaura- 

Saito method for calculating the reflection 
coefficients are a direct consequence of c above. 
Unfortunately however, none of these simplifications carry 
over into the more general multichannel AR problem because 
by comparing equations (A. 7b) and (A. 10b) it is evident 
that in general D i D . 

Proceeding as in the single channel case, it is shown 
in Appendix B that because of the structure in the correlation 
matricies _R_ and _R , equations (A. 7c) and (A. 11) can be 
re-written as 

R_ _F = £_ (A. 13a) 

and 

JR X = £ (A. 13b) 

where F , F , P_ and £ are obtained from D, D , _T_ and 
T respectively by taking their component vectors and turning 
them upside down in place; i.e. 
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hi 



“ 1Q o 



^n 1 *" ^nQ 



CTO 



(A. 14a) 



where 



f . . = [d. . (N) ... d. . (1)] 
-ID 13 13 



and 



P 



— y x * * * £ x v 

11 X 1 Q 



0 



P . . . p 

X Q X 1 X Q X 0 



(A. 14b) 



(A. 14c) 



with 



— x . x . = [ R . (N) ... R (1) ] 

1 3 x i x j x i x j 



( A . 14d ) 



As will soon be evident, the relationships of equations (A. 13) 

form the cornerstone for the Levinson algorithm and are made 

possible because the blocks comprising the .R and R_ matricies 

T 

are themselves Toeplitz and because R = R 

^ — X • X . — X . X . 

13 31 

Assume that the (n+l)-st order solutions can be related 
to the n-th order solutions by 



, (n+1 ) 
d • • 

-13 



d (n) 




r e ( n) 1 


-13 
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-13 


0 




k! n+1) 







L 1 ] J 



(A. 15a) 



210 



and 



d< n+1) 

-13 



d (n) 




/''"N 

2 

1 


-13 




-ij 


I ^ 
[ 0 
1 1 




k‘ 1 ? tl) 
li~ 1 3 J 



(A. 15b) 



Embedding the vectors e^ 1 ?^ and and the coefficients 

-13 -13 

, (n+1) , ^-(n+l) . . . . , . , ^(n) , ^-(n) 

k. . and k.. into matrices designated C and C* , 

^(n+1) anc j ^(n+1) anc j solving for them in the (n_l)-st order 

problem it is shown in Appendix C that 



.(n) . _ Y (n) K ( n+ 1 ) 



(A. 16a) 



C (n) = _ k 



(n) rr(n+l) 



(A. 16b) 



K (n+1) = [R CO)- r Cn) D } ] [R (n+l)-/ 9(n) D (n) ] 



‘—XX 



—XX 



(A. 16c) 



K (n+1) = [R (o)- r 

— —XX 



(n) T D (n) ] - 1 [R (n+1) T . / 0 <n) T D (n) ] 

XX — 



(A. 16d) 



Also the inverted terms on the right hand side of equations 
(A. 16c) and (A.16d) are just the backward and forward pre- 
diction error covariance matrices for the optimum n-th 
order models so that (A. 16c) and (A.16d) become 



211 



(A. 17a) 



K (n+1) = P (nrl [R <n*l)-0 (n,T D <n) ] 

“XX " 



K (ntl) = P (nrl [R (ntl) T -/7 (n)T D (n) ] 

“ ■ — XX ' ■ ■ 



(A. 17b) 



As in the case of the single channel problem, the prediction 
error covariance matrices obey the recursions 



p(n+l) _ p(n) [i-K(n + l)]<(n + l) ] 



(A. 18a) 



p(n+l) _ p(n) j_j^(n+l)|r(n+l) -j 



(A. 18b) 



completing the multichannel generalization of the Levinson 
algorithm for AR models . 

Comparing equations (A. 16), (A. 17) and (A. 18) with their 
single channel counterparts (2.18) and (2.19) it is clear 
that the multichannel Levinson algorithm simply represents 
a matrix algebra generalization of the single channel al- 
gorithm. Once again, predictors of all orders 0 _< n _< N are 
obtained in the process of finding the N-th order predictor 
along with all their prediction error covariance matrices 
and the overall minimization requiring the inversion of 
a QqN x QqN matrix is replaced by a sequence of N minimiza- 
tions, each requireing the inversion of two x matrices 
(P (n) and P (n) ) . 
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Using the relationships given in equations (A. 15) and 
(A. 16), successive orders of matrix polynomials can also 
be related to one another by 

[I-D (n+ 1 ) (z)] = ci-D (n) (z)]-z- 1 z- n [i-D (n) (z)]K (n+1) 

(A. 19a) 



z- (n+ 1 ) [I-D (n+ 1 ) (z)] = z- 1 z- n [I-D (n) (z)]-[I-D (n) (z)]K (n+1) 

(A. 19b) 



Premultiplying both sides of these equations by X(z) , trans- 
posing, and transforming into the time domain provides rela- 
tionships between the prediction error signals at each stage. 

e (ntl) (k) = e (n) (k) - K <n+1> e (n) (k-l) (A. 20a) 

| (n+1) (k> -- | (n) (k-l) - K (n+1)T e (n) (k) (A. 20b) 

Recognizing that for a zeroth order prediction, the forward 
and backward prediction error vectors are just the input 
vector itself, it follows that 

e^^(k) = e^^(k) = x(k) (A. 20c) 

and the multichannel equivalents of equations (2.28) have 
been obtained. 
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Next consider the N-th order multichannel HA model in 



which the current value of a Q^-vector of output signals 
^(k) is to be predicted from the present value and N past 
values of a Q^-vector of input signals x(k) . 

Qi N 



A 



y 




/ 4 / 4 d..(n) x.(k-n) 
i=l n= 0 



(A. 21a) 



or 



Y . ( z ) 
3 



Qi 

/ , dt. (z) X. (z) 

A — ■* i] i 

i=l 



(A. 21b) 



where 

dt . ( z ) 
13 



N 

d . . ( n ) z n 
13 

n=0 




(A. 21c) 



Using a superscript "+" to indicate the fact that the vectors 
are indexed from 0 to N rather than from 1 to N, define 
(N+l) -vectors for each of the input channels to contain their 
required time histories as 



xt(k) = [Xj(k) 



T 



x. (k-n) 
3 



] 



(A. 22a) 



and embed these vectors into a single QHN + D-data vector 
given by 

T 

X + (k) = [xt(k) T ... x* )k) T ] (A. 22b) 

— -1 -Q. 
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Define a Q^(N+1) x matrix of weights as 



D* = 



ill 



V • 



-IQ 



0 



-Q,Q 



i 0 



where the (N+l)-vectors d^ are given by 



dt. = [d. . (0) ... d . . (N) ] T 
-13 id iD 



(A. 23a) 



(A. 23b) 



and contain the coefficients of the polynomial d^^(z). These 
polynomials can be combined into a single matrix polynomial 
given by 



Dhz> = 



d ii (z) 



d lQ 0 <z) 



(A. 23c) 



d Q.l Cz) 



d Q.Q„ <z) 

y i y 0 



With these definitions, the N-th order MA prediction of y(k) 
is given by 



y(k) 



Xhk) T D* 



(A. 24a) 
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or in the transform domain. 



Y(z) T = X(z) T JD + (z) (A. 24b) 

Defining a prediction error vector as 

e^Ck)^ = y(k) T - y(k) T (A. 25a) 

and setting the coefficients in _D + to minimize the trace 
of the prediction error covariance matrix 

P Q = e{e 0 (k)e 0 (k) T } (A. 25b) 

results in a solution given by 

£ (X + (k) X_ +(k)T} D + = £{X. + (k) y(k) T } (A. 26a) 

or 



— + + 
x l x l 


• ♦ • 


* + + 
x l x Q 1 


D + = 


£ + 
x i y i 


r + 

x l y Q 0 
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* t + 
X Q. X 1 


♦ ♦ ♦ 


R + + 

V<V 




£ + 
x Qi y i 


••• it 

% y %_ 



(A. 26b) 
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Adopting a shorthand notation this becomes 



R + D + = r + 



(A. 26c) 



Assume a relationship between the components of the 
(n+l)-st and n-th order solutions given by 



^+ ( n+1 ) _ 

-ij 







(n+1) 








+ 


(n+1) 


0 




a . . 



(A. 27) 



Embedding the vectors Y^j^ n+ ^ and the coefficients g|j 1+ '^ 
into matrices designated ^(n+l) and G^ n+ D , and solving 
for them in the (n+l)-st order MA problem it is shown in 
Appendix D that 



^(n+1) _ _p(n+l) g(n+l) 



(A. 28a) 



s (n+l) = pCn+l)' 1 ^ (n+l)-/" <n+ 1 )T D t(n) ] 

xy 



(A. 28b) 



— (n+1) — r(n+l) -(n+1) 

where _£_ /£ amd P/ are matrices , that emerge 

from the backward prediction problem in a multichannel (n+l)-st 
order autoregression on the input signal vector x(k). Again, 
it is clear that the multichannel MA solutions given by 
equations (A. 28) are a matrix algebra generalization of 
equations (2.23) for the single channel MA model. 
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To relate successive order MA matrix polynomials to one 
another, equations (A, 27) and (A. 28a) can be used to write 



D + (z) (ntl> = D t (z) <n) tz' n [I-P (n) (z)]G <n+1) (A. 29) 



where the second term on the right hand side that premul- 
tiplies G is the backward prediction error matrix polynomial 

from the autoregression on the input signal x(k) . Pre- 

T 

multiplying by X (z), transposing, and transforming into 
the time domain results in the multichannel equivalent of 
equation (2.37) 



y 



(n+l) 



(k) 



= ; (n) <k) ♦ G (n+ i ) V nt i) 



(k> 



(A. 30) 



and completes the derivation of the recursive in order 
solutions for the multichannel MA model. 
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APPENDIX B 



A Key Relationship For The Multichannel Levinson Algorithm 



Equation (A. 13) is a key relationship in the development 
of the Levinson algorithm responsible for much of the 
algorithm's simplicity. Without this relationship, more 
than just the K and K matrices would be needed to obtain 
the new model from the previous model. 

In equation (A. 7), consider the multiplication of the 
i-th block row of JR by the j-th block column of JD to 
form r 



-X i j 



R d n . + 

- x i X l"l3 



+ R d^ . = r 

— x . x„ — Q n j — x . x . 

1 Qg 0 J X ] 



(B.l) 



In particular, consider a general term on the left hand 
side of equation (B.l) in detail. 



R (0) 
x .x m 
x m 



R (1-N) 
x.x^ 
x m 



R (N-l ) R (0) 

x-x m x.x m 

x m x m 



d .(1) 
m 3 



d .(N) 
m 3 



R (1) 
x . x . 
i 3 



R (N) 
X i X j 



( B . 2 ) 



Define upside down versions of the d.. and r vectors 

-X3 - x iXj 

as 



d. . (N) 
13 



(B. 3a) 



d. . (1) 
13 
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Using these permuted vectors in equation (B.l) in place of 
the d and r vectors, the relationship is still satisfied if 
the R matrices are permuted as well. In particular, from 
(B.2) it is evident that 



R v (0) 
x . x ™ 

1 m 



R x.x ^ 
x m 



d. . (N) 
i] 



R 

x 



. X 



i m 



( 1-N ) 



R x.x < 0) 
i m 



d. . (1) 
ID 



R (N) 
x . x . 
i 3 



R x.x. (1 > 
1 3 



or equivalently 



(B.4a) 



R T f. R T 

- x i x I - 13 " x i x Q, 



^<4 = ^i x - 



3 



( B . 4b ) 



Embedding the f . . and £ vectors into matrices designated 

13 X ij . . _ 

F and respectively and using the definition of _K it 



follows that 



Defining f.. and p as upside down versions of their 

x 1 T\ T 

corresponding vectors in JJ and _T_ in equation (A. 11) and 
embedding them in matrices designated F and it also 

follows from a similar development that 



R F = /o 



(B. 6) 



Equations (B.5) and (B.6) are essential to the develop- 
ment of the Levinson algorithm and are made possible by the 
Toeplit? structure of the block components of the R and 
R matrices. 
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APPENDIX C 



The Multichannel Levinson Algorithm Solution 

For AR Modeling 



In the (n+l)-st order versions of equations (A. 7) and 
(A. 10), the component matrices that make up _R and JR can 
be written as follows 



R (n+1) 
- x i X j 



r <n> 

lx.x. 
1 0 



v x .x . ( -N) 
i 3 



x i x^(-l) 



R X.X. (N) • * ,R X.X. (1) , R 



i 3 



i 3 



x .X . ( 0 ) 

i 3 



R (n) 

XfX. 


1 p <n) 

1 -X i X j 




T 

p <n) 
-X.x. 

3 i 


7 

1 

1 R x . x . ( 0 ) 
1 i 3 



(C.l) 



and 



_(n+l) 

R x 
x i j 



(n+1) 

R 

— x . x . 
i 3 



T 

R (n) 

— X . X . 

1 3 



(n) 

. x . 
t 3 



-x.x 



(n) 

: . x 
1 3 



p 

-x.x. 



^x . X . ( 0 ) 

i 3 



( C . 2 ) 
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Additionally , 





(C. 3) 



R 



x .x . (n+1) 
i 3 



With these matrices and vectors written in this partitioned 
form, and with the relationships assumed in equations (A. 15) 
between the n-th and (n+l)-st order solutions, the (n+l)-st 
order modeling equations become: 

Forward Model: 



R(rO D (n) + R (n) £ <n) + /3 <n) K (n+l) = r (n) (C-4a) 




(C. 4b) 



Backward Model : 



J^(n) J)(n) + J^(n) T £ (n) + ^(n) ^(n+1) _ r (n) 



(C. 5a) 



/?(n) |)(n; + yo^ri) c (n) 





T 



R (n+1) 
-xx 



(C. 5b) 
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Equations (C.4a) and (C,5a) contain the n-th order modeling 
equations within them however, and therefore can be written 
as 



J^(n) ^-(n) _ _ ^(n) ^(n+1) 

^(n) "jr"(n) _ _ ^(n) ^(n+1) 

and applying the relationship developed in Appendix B 
(equations (B.5) and (B.6)) yields 

^-(n) _ _ "p"(n) K (n+1) 



C (n) = - F (n) K (n+1) 



(C.7b) 



Thus the e_ and £ vectors have been found in terms of the 
known quanties f and f and the unknown K and K matrices. 
Substituting equations (C.7) into (C.4b) and (C.5b) completes 
the solution resulting in expressions for the K and K 
matrices given by 



K (n+1) 




( 0 ) 



r (n) 




D^CRxxCntl) 



_/<3 (n)i D (n) ] 



(C.8a) 
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K (n+1) = [R xx (°>-_L (n) _D (n) ] [R xx (n+i) T -/<? (n) D (n) ] 

(C. 8b) 

The terms on the right hand side of these equations that 
are inverted are just the backward and forward prediction 
error covariance matrices, for the optimum n-th order models 
given by 

p (n) = R xx (°> - _L (n) D (n) (C.9a) 

p (n) = R (0) - r (n) D (n) (C.9b) 

— —xx — — 

Using equations (C.3), (A. 15) and (C.7), the forward pre- 
diction error covariance matrix for the optimum (n+l)-st 
order model can therefore be written as 

p (n+l) _ R (Q) _ r (n) T Q(n) + r (n) T p"(n) K (n+l) 

- R (n+l) T K (n+1) (C.lOa) 

—XX — 



= P (n) + [JT_ (n)T F_ (n) -R xx (ntl) T ]K (n+1) (C.lOb) 

= P <n) [I-P (n) " 1 [ R xx (ntl) T -/£ <n>T p (n) ]K <n+1> : 

(C. 10c) 



225 



(C.lOd) 



p (n+l)_ p(n) |- j_^(n+l) K (n+l) -j 



and following a similar development for the backward pre- 
diction error covariance matrix results in 



p(n+l) _ p(n)|--j. _ K (n+l)^(n+l) ^ 



(C.ll) 
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APPENDIX D 



The Multichannel Levinson Algorithm Solution 
For MA Modeling 



First note that because of the definitions of the x.(k) 

—x 

and x^(k) vectors (indexed from 0 to n and 1 to n respec- 
tively) in the n-th order models, the matrices R in 

x.x. 

the n-th order MA problem could also be written 1 in 
terms of x^(k) and x_.(k) as (assuming stationarity ) 



R (n) 

— + + 
x.x. 
a 3 



R (n+1) 

^x.x. 
i 3 



(D. la) 



so that 



R + 



(n) 



R 



(n+l) 



(D.lb) 



(n+l) 



(n+l) 



The components of R + and in the (n+l)-st 

order MA modeling problem can be written in partitioned form 
as 



R ( n+l ) 

- + + 
x.x. 
i 3 



II 


1 

X i X j 1 


-(n+l) 

—x.x. 
i 3 




7 

-(n+l) 1 ! 
•2-x . x . 

3 i 1 


^x . x . ( 0 ) 




1 3 



(D , 2a) 
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and 



(n+1) 
— + 

x i y j 



(n) 



£ + 



x^j 



R (n+1) 
x.y . 



(D. 2b) 



Using these partitioned forms, and the relationship in 
equation (A. 27) between the n-th and (n+l)-st order solutions, 
the (n+l)=st order modeling equations become 



^,(n) ,(n) _.(n) 

R* D* * R* ^ (ntl) 



+ / ^(n+l) G (n+l) = r + 



(n) 



(D. 3a) 



/^(n+l) D + + () + R xx (0)G < ' n+ '^ 



= R (n+1) 
-xy 



(D.3b) 



Equation (D.3a) contains within it, the modeling equation for 
the n-th order HA model and using equation (D.lb) can be 
rewritten as 



j^(n+l) ^ (n+1) _ _ ^(n+D^n+l) 



(D . 4 ) 



A comparison of this result with equation (C.6a) shows that 



J (n+1) _ p (n+1 )g (n+1) 



(D. 5) 
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where F_^ n+ ~^ ±s the permuted version of the backward 
solution in an (n+l)-st order AR model of the input signal 
vector x(k) . Furthermore, substituting equation (D.5) into 
(D.3b) results in a solution fop G^ n+1 ^ given by 

G (n+1) = [R (0) - ^ (n+1)T T< n+ l) :l ' 1 

■“XX 

[R (n+l)-/£ (n+ 1 )T D +(n) ] ( D . 6 ) 

xy 



Since 



^(n+l) T p (n+1) _ r (n+l) T J)(n+1) 

it follows that the inverted term on the right hand side 
of equation (D.6) is just the backward prediction error 
covariance matrix for the optimum (n+l)-st order AR model 
on the input signal vector x(k). 
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APPENDIX E 



Prony r s Method For ARMA Modeling 



Prony ' s method [Refs. 8, 52 and 56] obtains a zero pole 
model for a system by matching the impulse responses of the 
system and model over the first M+N+l sample intervals where 
M and N are the orders of the model transfer function num- 
erator and denominator polynomials. Assume that a signal 
y(k) is available that represents the impulse response of 
a causal system and that a rational transfer function model 
for this systems is desired. Using a to denote the 

model output and u(k) to denote the input to both the system 
and model, the model transfer function is given by 



H(z) 



Y(z) 
U ( 2 ) 



M 

l a(n)z 
n= 0 



1+ I b(n) z 
n=l 



(E. 1) 



For a unit sample input it follows that 



B(z) Y(z) = A(z) 



(E. 2) 



Equating like powers of z in this relationship results in 



N 

b(i) 

i= 0 




y(n-i) 



a(n) ; 
0 



0 < n < M 



else 



(E. 3) 



where b(0)=l. 
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Equating y(k) and y(k) over the interval 0 < k _< M+N produces 
a set of M+N+l equations which can be expressed in matrix 
form as 




(E.4a) 



or adopting a shorthand notation 



+ 

*1 


Si- 




1 




+ 

a 


^2 


ll 




b 




0 



Assuming that the NxN matrix is not singular if 
b = -y; 1 y 2 



(E.4b) 

follows that 
(E. 5a) 



zl - s x r 2 \ 2 



(E.5b) 
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The original Prony method goes on to form a partial 
fraction expansion and inverse transformation on the model 
transfer function H(z) resulting in a model for the impulse 
response of the system given as a sum of complex exponentials. 
This in unnecessary here however, since a rational transfer 
function model is the form sought. 

Prony' s method inherently assumes that matching a 
sufficient portion of the impulse response of the system 
results in an accurate model but this is not necessarily 
the case unless the impulse response damps out quickly or 
unless the system can be represented exactly by a low order 
model. Otherwise a very high order model may be required 
to obtain sufficient accuracy. Other difficulties asso- 
ciated with this technique are: 

1. ) The system impulse response must be available; 

2. ) There are no built in mechanisms to test for or 

ensure stability of the model; 

3. ) There is no averaging of noise in the data; 

4. ) Only a small portion of the available data points 

(M+N+l) are actually used. 

These difficulties can be partially overcome by modifying 
the procedure to obtain an approximate match of the system 
and model responses over their entire duration rather than 
an exact match over the first M+N+l points. Consider 
equations (E.4) modified to include the entire signal y(k) 
for 0 < k < oo. 



232 



(E. 6 a) 



y ( 0 ) | 0 

yd) | y( o) 



o 



l 



b 



+ 

a 



y(M) I y(M-l) 



y ( M+ 1 ) | y(H) 



Adopting a shorthand notation this becomes 



*1 I Si 




1 




+ 


f 






- 




1 




b 




0 


*3 S 3 






1 


1 1 



This yields two equations 
+ 2i k = a + 



(E. 6 b) 



(E. 7a) 



y 3 + y 3 * s ° 



(E , 7b ) 



but with"y 3 , and £ having an infinite number of rows, 
equation (E.7b) will in general have no solution. In practice 
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only a finite portion of the system impulse response y(k) 
can be considered but equation (E.7b) will still be incon- 
sistent in general. A least squares estimate of b can how- 

T 

ever, be obtained by minimizing e e where 

e = Y 3 b - (-y 3 ) (E.8a) 

resulting in 

fe = «323> 4 2 3 <E ' 8b) 

which in turn can be used in equation (E.7a) to find a + . 

This approximate version of Prony’s method is the one most 
commonly applied. 
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APPENDIX F 



Parabolic Surfaces In n-Dimensions 

Multi-dimensional parabolic surfaces are described by 
an equation of the form 

y = x^Ax -2x T b+c (F.l) 



where : 

y is the independent variable; 
x is a vector of dependent variables; 

A is a symmetric constant matrix; 

b is a constant vector; 

c is a constant. 

(x, b and c can also be considered as matricies with the 
trace of the right hand side set equal to the independent 
variable but the problem remains essentially unchanged.) 
Completing the square for nonsingular A this becomes 

y = (x - A -1 b) T A(x - A" x b) + c - b T A _1 b (F.2) 

so that for positive definite A it is clear that the minimum 

value of y is obtained for x = A""'*‘b and this minimum is 
T -1 

c - b A b. It is also clear that nonzero values of b and c 
simply raise and lower the surface and move the minimum 
point away from x = £. The shape of the surface (its 
relative concavity or flatness) is determined by the matrix 
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A in the quadratic term of equation (F.l). Therefore, to 
study the shape of this surface, consider the simpler pro- 
blem with b and c set to zero. 

y = x T A x (F . 3 ) 

One way to examine the relative flatness or concavity is to 

look at the locus of points on the surface for constant 

values of y; in particular when y=l. Recognize that A can 

T 

be rewritten as Q A Q where A is a diagonal matrix of 
eigenvalues and Q. is a matrix whose columns are the unit 
length eigenvectors of A. Now (F.3) becomes 

x T Q A Q T x = 1 ( F . 4 ) 

T 

and introducing a new set of variables w=Q x (which are 
simply a rotation of the variables in x) , equation (F.4) 
reduces to 

A..Wt + ... + A w^ = 1 (F.5) 

11 n n 

This equation describes an ellipsoid in n dimensions whose 
axes half lengths are given by 1//AT for 1 < i _< n. This 
follows from letting all but one of the w’s equal to zero 
and solving for the nonzero variable so that one point on the 
surface is for example w^ = 1/ /T^ with w 2 = ••• = = 0 . 
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This point is just a multiple of the first eigenvector of 
A so that in general, the axes of the ellipsoid point in 
the direction of the eigenvectors of A with half lengths 
given by the recriprccal of the square root of the eigen- 
values . 
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APPENDIX G 



Multichannel ARMA Modeling 



Consider a system with output signals [y^(k)...yp (j < )] 



and Q. input signals [u,(k) u^ (k)]. The multichannel 

ARMA analysis model forms an estimate of the present value 
of each output as a weighted combination of past values of 
all output signals and past and present values of all input 
s ignals . 

Q 0 N 

y n (k) = S 2 b jn (i> yyk-i) 



Q i N 



. S 2 a in (i 



j=l i=0 



. ) u. (k-1) 
jn 3 



(G . 1) 



Define data vectors for all the input and output channels as 



y_ n (k) = Cy n (k-.l) ... y n (k-N)] 



(G. 2a) 



u + (k) = [u (k) ... u (k-N)]^ 

— n n n 



(G. 2b) 



and embed them into Q^N and Q^(N+1) vectors given by 



Y = Cy x (k) 



r + sT 

U = Lu^(k) 






(k) 1 ] 



0 

+ T 
Hn ( k ^ 
^i 



(G. 3a) 



(G. 3b) 
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Define NQ q x Q q and (N+1)(K x Qg matrices of coefficients 
given by 



in 



b i. 



and 



B 






01 



^nQ 



0^0 



A+ = 



^11 



V 



^1, 



^Q ± Q 



o 



where 



b. . 
-ij 



Cb ij ( i) 



. b i -(N)] 



( G • 4a ) 



(G.4b) 



(G.4c) 



at . = [a. . (0) . . . a. . (N)] (G.4d) 

-10 10 10 



With these definitions, the multichannel ARMA estimate for 
the vector of output signals becomes 



y(k) 



T 



[Y J 



U ] 




Forming a prediction error vector as 



(G. 5) 



eg(k) = ^(k) - ^(k) 



( G . 6 ) 
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and setting- the model coefficients to minimize the trace of 
the prediction error covariance matrix results in a solu- 
tion given by 



-YY 


R + 

YU 




B 




~ 1 

K 

-Yy 


5. + 

U Y 


* + + 
u u 




1 

+ 

<1 

1 




R + 

U y 



( G . 7 ) 



In the transform domain, the prediction error model is re- 
presented by 

E o ( z ) T = Y( z) T [I-B(z)]-U(z) T A(z) (G.8) 

where E, Y and U are the transforms of the error, output 

and input vectors and the coefficients of the i,j-th elements 

of the matrix polynomials B(z) and A(z) are the elements of 

the vectors b.. and at.. The multichannel ARMA synthesis 
-13 -13 

model is then given by 

Y T (z) = U(z) T A(z)[I-B(z)]" 1 (G . 9 ) 

with the matrix polynomial fraction serving as the generali- 
zation of the zero pole transfer function. 
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APPENDIX H 



Delay Free Loops 

To develop equation (4.24) guaranteeing the absence of 
delay free loops in the system of Figure 4.8, consider the 
equation for y^Oc). 

y N ( k) = F[x N (k)] (H.l) 

Since F[ . ] is a memoryless nonlinear function, proving that 
x^(k) is not a function of y ^ ( k ) is equivalent to proving 
that y^(k) is not a function of itself, and therefore no 
delay free loops exist. From equations (4.17b) and (4.17d) 
with u(k)=0 it follows that 

X N (z) = L 2 I (z) Ii Xn ( z) (H.2) 

and the coefficient of y^ at time k on the right hand side 
is given by 

a = Lim £ 2 T(z) £ (H.3) 

z->°° 

A nonzero indicates a dependence of x ^(k) upon y^^(k) 

and clearly therefore all the main diagonal elements of a 
must be zero to avoid delay free loops. These elements are 
the terms of the (Q-l)-st principal minors of a. While this 
is a necessary condition it is not sufficient to avoid delay 
free loops since loops may exist through two or more signals 
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0 for all i,j such that 



The condition that a., a.. 

13 Di 

1 _< i,j _< Q and it j , ensures that no delay free locp exist 

through 2 signals and is equivalent to requiring all terms 

of the (Q-2)-nd principal minors of a are zero. A term of 

a determinant [Ref. 63] is defined as the product of elements 

of the matrix taken one from each row and one from each 

column and the determinent of the matrix is the sum of all 

possible terms. It is clear therefore that every term 

must contain a cycle such as a.. a.. ...a,, and must there- 
in jk li 

fore be zero. Since the determinant consists of every 
possible term, requiring that all terms of the determinants 
of the (Q-i)-th principal minors are zero ensures that no 
delay free loops exist through any combination of i signals. 
Examining all terms of the determinant of a and all its - 
principal minors ensures that all possible loops through 
the Q signals in x^(k) are examined. If any delay free loop 
exists, then at least one of the terms of one of the deter- 



minants will be nonzero. 



APPENDIX I 



Nonlinear ARMA Modeling Examples 



The determination of memory requirements for the non- 
linear ARMA model as well as its applicability to systems 
consisting of interconnections of linear and memoryless non- 
linear subsystems is illustrated here using two examples. 
First a cascade of linear and nonlinear subsystems is con- 
sidered. Then a real world example is considered and a 
nonlinear ARMA model is proposed for the tracking behavior 
of a phase locked loop. 



A. A CASCADE OF LINEAR AND NONLINEAR SUBSYSTEMS 

Consider the system shown in Figure 1.1 where the signals 
uCk) and Y^^k) are observed. In terms of the topology of 
Figure 4.8, seven signals can be identified (x T , , x T 0 , y T 

Li -L Li Z Li _L 5 

^L2’ X N1’ ^Nl ant ^ however for convenience three of the 
seven equations in (4.26) which specify 



x Ll^k) = u(k) (I. la) 

X L2 ^) = YNl (k) (I. lb) 

x N i(k) = y L1 (k) d.ic) 

will not be explicitly written. In this case, equation 
(4,26) becomes 
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*.y L 2 (k) 



u(k). 



A^( z ) 




1 — 1 

* 

t — 1 
i — 1 

pu, 




A 2 (z) 


1-B (z) 






1-B 2 (z) 



Figure 1.1. A Nonlinear System 



y L2 (k) 




u(k) 




y Ll (k) 




y Nl (k) 





b 2 (k) 



0 

0 



a. £ ( k ) '■ • 



a 1 (k)* | b 1 (k)* 0 



F 1 C*] 0 



y L2 (k) 



u(k) 



y L i (k) 



y Nl (k) 



( 1 . 2 ) 



where the finite memory representations of T-^(z) and T 2 (z) 
have been used. The objective now is to eliminate a 2 (k)* 
from the upper right partition. Using the equations of rows 
three and four, the first row can be rewritten as 



y L2 ^ =b 2 (k)* y L2 ^ k ^ +a 2 ^) '‘F^Cd-^Ck) *u(k) tb^Ck) *y L1 (k) ] 

(1.3) 

Notationally it is difficult to write equation (1.3) in the 
operator matrix form of equation (1.2) because of the non- 
linear function however, the following representation is 
adopted . 



M 




• o 


O 


l 

1 

1 


o 


O 1 


1 — 1 




1 

t 

l 










1 






• 




1 










i 










t 


•it 




/*“N 




t 


/-N 


1 — 1 


AC 




t 


AC 


• 






I 


S-/ 


5 1 


H 




l 


1 — 1 


1 — 1 




o 


l 

I 


A 


pH 






I 

i 










l 






► 




l 










l 






«J 5 




I 






/-N 




l 






AC 




I 






v_x 




i 


•it 




rH 


1 — t 


i 


/— N 


o 


rd' 




l 


X 




i—j 




i 


N_/ 




i — 1 




l 


rH 




pH 




1 


rd 




•it 
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No further reductions are possible to make the upper right 
partition a null matrix leading to the conclusion that for 
a finite memory nonlinear ARMA model to be appropriate, 
either b^(k) must be zero (the first linear system must be 
nonrecursive) or y^^(k) must also be observed. Alternately, 
if b^(k) is nonzero, an infinite memory representation can 
be used for the first linear system by replacing a^(k)*(.) 
with h^(k)*(.) and b^(k)*(.) with zero in equation (1.4) 
indicating that an infinite memory nonlinear ARMA model is 
appropriate when only u(k) and y^ 2 (k) are observed. 

B. A NONLINEAR ARMA MODEL FOR A PHASE LOCKED LOOP 

A continuous time model for the tracking behavior of a 
phase locked loop [Ref. 55] is shown in Figure 1.2 where: 
0^(t) is the phase of the incoming signal 

0 2 ( t ) is the estimate of the phase of the incoming signal 
e(t) is the phase error signal 

F(s) is the transfer function of the loop filter 
and are constants 




Figure 1.2. A nonlinear model for the tracking 
behavior of a phase-locked loop. 
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The model is nonlinear because of the sin function in the 
loop. Often the assumption is made that e(t) <<ir/2 so 
that sin e(t) 2 e(t) in which case a linearized model is 
obtained as 

0 2 (s) K K^FCs) 

0, (s) = K, K F(s) + s (I * 5 

1 1 2 

A nonlinear ARMA model for the system can be obtained by 
first discretizing the model of Figure 1.2 as shown in 
Figure 1.3 where F(z) represents the discrete loop filter. 




Figure 1.3. A discrete version of the phase- 
locked loop model. 



-1 

The integration has been approximated as =p . (It is 

l-z" x 

necessary to use this Euler forward approximation for 
integration rather than one such as the trapezoid rule to 
avoid a delay free loop) . Defining a single linear system 
as the cascade of FCz) and the discrete integration 



-1 

K 1 K 2 ~~TT F(z) 



A( z) 
l-B(z) 



( 1 . 6 ) 
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Equation (4.26) for the phase locked loop becomes 



0 2 (k) 
6 1 (k) 




b 1 (k)* 0 | a x Ck)* 0 

0 1 1 0 0 




0 2 Ck) 

0 x (k) 






- - - - T 






y N (k) 




0 0 | 0 sin( . ) 




V k) 


e(k) 




-1 1 1 0 0 




e(k) 



where it is assumed that 9^(k) and 9£(k) are observed. Using 
the relationships specified by rows 3 and 4, this can be 
written as 



0 2 (k) 




b x ( k ) * 


CD 




0 


y H (k > 




0 


e(k) 




-1 



0 1 0 0 

1 | 0 0 

-+ 

0 I 0 sin( . ) 

I 

10 0 



0 2 (k) 
9 1 (k) 



y N (k) 

e ( k) 



a 1 (k)“ 


sin[-( , ) 


(.)] | 


1 0 


0 




9 2 (k) 




0 


o 1 


1 0 


0 




e, (k) 




0 


0 1 

1 


1 0 

1 


0 




X 

y N < k) 




0 


° ! 


0 


0 




_e(k) _ 



( 1 . 8 ) 



from which it is clear that a finite memory nonlinear ARMA 
model is appropriate. The model can be obtained from the 
first row in equation 1.8 by substituting a series expansion 
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for the sin function and truncating it at the degree desired 
resulting in 



M 



0 2 (k)=b 1 (k)*O 2 (k) 



a^(k) 



m= 0 



TSnTT (9 2 (k )‘ e i (k) ) 2m+1 



(1.9) 



where 2M+1 is the degree of the series approximation to 
sin(.). (Note that lim A^(z) = lim B^(z) = 0 so that the 

Z-yco Z->-oo 

right hand side of equation (1.9) involves only past values 
of the output 0 2 (k).) An infinite memory Volterra series 
model for this system has been considered by Van Trees 
[Ref. 55]. 
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APPENDIX J 



Model Simulation Program Listings 

This appendix provides a listing of the fortran model 
simulation programs used in the experimental study of the 
lattice characteristics. Included are the main programs 
for the batch processing ARMA lattice, the adaptive ARMA 
lattice and the brute force model solution. Each main 
program is followed by a collection of subroutines used only 
by that specific main program. Then a collection of common 
subroutines called from two or more locations in the batch 
lattice, adaptive lattice or brute force method is listed. 
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